ABSTRACT


This project report summarizes the bandwidth compression
research activities performed by the University of Kansas
under contract number F33-615-74-R-1093 with the Air Force
Avionics Laboratory at Wright-Patterson Air Force Base, Ohio.

The primary purpose of this study is to investigate the
feasibility of video bandwidth compression on the order of
50:1. This compression was simulated using imagery digitized
to 64 grey levels (6 bits). The task was accomplished in
three steps: transform compression, frame rate reduction, and
Differential Pulse Code Modulation (DPCM). Since the transform
compression scheme is vital to the success of this project,
major emphasis was placed on this area. Results indicate that
a 50:1 compression is feasible and that the best transform to
use in a hybrid manner with DPCM is the Discrete Cosine.
Additional research led to a fast implementation of the
Karhunen-Loeve transform, which is optimum under the root mean
square error criterion. The performance of other fast
transforms is compared, and an optimum bit coding scheme is
presented.

In the selection of the subject of this analysis, Bandwidth
Compression of Video Signals from Remotely-Piloted Vehicles
(RPV Systems), consideration was given to the importance of
the topic itself to the Air Force and Avionics Community.

Many Air Force vehicles and satellites act as remote sensing
platforms for collecting weather data, reconnaissance, earth
resources studies and television coverage of the moon's
surface. The Armed Services, and especially the Air Force, are
tending toward an all-digital aircraft. These future plans
rely upon the technological advancements and research time
required to solve many problems associated with digital
avionics. All of these applications may be analyzed by the
use of spectral and/or spatial data of remotely sensed scenes.

It is of prime importance to consider not only what data can
be collected, but what data must be transmitted, at what rate
the transmission must take place, and the resolution
requirements of the reproduced data.

These  several  applications  have  many  common  problems. 

The  vast  amounts  of  data  require  large  bandwidth  systems  or 
long  transmission  times  at  much  narrower  bandwidths,  excessive 
memory  constraints  and  excessive  analysis  time.  Standard 
commercial  TV  frame  rates  of  30  frames/sec.  result  in  a 
nominal  4.5  MHz  analog  bandwidth  with  a transmission  rate  of 
about  20  Megabits/sec.  With  the  cost-driven  system  constraints, 
it  is  imperative  that  only  minimum  information  be  transmitted. 

The redundancy within the sensed environment must be capitalized
upon and the size and weight of computerized systems must be
held to an absolute minimum. Applications of appropriate
bandwidth compression techniques can have a major impact on the
cost and efficiency of transmission systems for the Air Force.
As applied to RPV systems, two major considerations must be
investigated: first, the reconnaissance role in which high
resolution data must be preserved, and second, the near real
time transmission problem under flight. In the former, it
is imperative that the information content and resolution
requirements be maintained; rapid analysis may not be required,
and memory storage requirements are the forcing function. In
the latter, near real time operation suggests a relaxation of
resolution constraints, but transmission speed and simplicity
are of prime concern. The need to define these transform
parameters and their limitations is emphasized by the Control
Data Retrieval Systems Working Group at Wright-Patterson AFB.

SECTION I

TRANSFORM IMAGE DATA COMPRESSION

1.  INTRODUCTION

In the design of image coding systems for digital transmission
of images one is faced with providing a method which can be
reasonably implemented, which minimizes the number of digital
bits used for transmission, and which keeps distortions of the
original scene within a pre-determined fidelity criterion. With
the power of digital computers and the introduction of fast
transforms, the transform of the image is encoded rather than
the image itself. This form of data requires long transmission
times at a fixed bit rate determined by the design.

It is therefore advantageous to reduce or compress the data
while preserving the information content. In remote sensing
applications from satellites or remotely piloted vehicles, the
data contains a large amount of redundant information due to
high correlation of graytones of spatially adjacent resolution
cells. It is this property which permits bandwidth compression
of digital images. Bandwidth compression is therefore desirable
in terms of transmission time and required storage and, when
used with spread spectrum techniques, as a countermeasure to
signal jamming.

Briefly, bandwidth compression schemes fall into three
categories [1]:

1.  Statistical  models 

2.  Psychovisual  models 

3.  Transform  techniques 

The statistical models appear at first glance to incorporate
the natural randomness of data. However, actual probabilities
of data points within some general set are not often known
accurately. Mean and variance are gross measures and do not
lend themselves easily to picture coding in terms of
reconnaissance value. Schreiber has investigated third order
statistics but sufficient data have not yet been obtained
[1]. The psychovisual approach accounts for the inability
of the receiver, the human eye, to detect signals beyond
certain gray limits while being slightly affected by high
frequency phenomena even though the magnitude of such
disturbances may be low. Again, however, modeling of the eye or
the human response is not complete and is plagued with
nonlinearities still to be accounted for. The transform
technique appears to offer an approach which can include both
of the above effects and lend itself to digital modeling [1].

This research applies the transform technique to bandwidth
compression of video from remotely piloted vehicles. In this
report, the concepts of bandwidth compression are developed.
The performance criteria are presented with the procedure for
the Karhunen-Loeve transform, which is optimum under the mean
square error criterion. As part of the conducted research, a
fast Karhunen-Loeve procedure is developed. The assumptions
for the separability of this transform are presented with
data demonstrating its optimality when compared to other fast
transforms. With regard to the coding for compression, three
procedures are used: the notch filter, energy compression,
and a variable word length code. The differences between the
fast transforms which are feasible for interframe processing
are discussed, with the conclusion that with this variable word
length code the Discrete Cosine is the best performer. How
this selection was made is reviewed in light of the established
criteria and sequency properties of the fast transforms.

The feasibility of a system compression of 50:1 is then
investigated. This is done by the application of the "best"
transform with Differential Pulse Code Modulation (DPCM) in
the frequency domain between frames. An additional analysis
of the differences which contribute new information for
updating has been completed.

This research therefore offers evidence of the feasibility of
system compression of 50:1 and of the errors incurred in this
process, as well as complete computer documentation of
programs directly generated for the compression scheme.

2.  Review of the Literature

The principles of redundancy reduction, or data compression,
have, surprisingly, been known for some time. Their application
in image data compression has, over the past several years,
sparked intense investigation. As early as 1958, Good [2]
described a technique of eliminating redundancy in the matrix
transform by matrix factorization. Fourier analysis and
other orthogonal transforms were well defined but decomposition
was not to have its impact until 1965, when Cooley and
Tukey [3] reported a computationally efficient means to
calculate the complex Fourier coefficients. With the power of
digital computing the door was now open to image transmission
and the investigation into comparative approaches. By March
1967, the Proceedings of the IEEE presented a collection of
articles under a "Special Issue on Redundancy Reduction".

Andrews and Pratt [4] used an approach which treated the
Fourier Transform of complete pictures. Anderson and Huang
[5] followed with an adaptive version. In this approach the
image was divided into a number of subimages and the
coefficients selected for transmission were based on a linear
relationship to the standard deviation. This became known as
the Piecewise Fourier method and when used with adaptive
thresholding reportedly performed well in the presence of noise.

The discrete Fourier transform for an image is given by
Andrews [6] and other texts as:

\[ F(u,v) = \frac{1}{N} \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y) \exp\left\{ -\frac{2\pi i}{N}(ux + vy) \right\} \]

The inverse Fourier for reconstruction is:

\[ f(x,y) = \frac{1}{N} \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} F(u,v) \exp\left\{ \frac{2\pi i}{N}(ux + vy) \right\} \]

Many researchers have transmitted the complete Fourier
transform of the picture. This particular process begins by the
partitioning of an N x N picture into $(N/n)^2$ (n x n)
sub-pictures where n << N. Having sampled the image in terms of
brightness, the discrete Fourier transform of each sub-picture
is expanded into its frequency coefficients and an adaptive
threshold procedure is applied.

Huang [5] defines a quantity L which is linearly proportional
to the integer part of $\sigma$, the second moment of the
sub-picture, defined as:

a is the ensemble average which is used as an estimate of the
true mean and given by:

B(m,n) is the brightness value of the two-dimensional sub-picture.
L coefficients are transmitted with this method. To express
the Fourier transform as a matrix let $W = \exp(-2\pi i/N)$; then
[F] = [A] [f] [A]. Since the Fourier is a symmetrical and
separable matrix:

The Hadamard Approach:

The Hadamard transform is based on the Walsh functions.
Walsh functions can be generated by first writing the binary
representation of a word, converting to a Gray code, and then
multiplying the Rademacher functions represented by a 1 and
leaving out those represented by a zero [7]. Walsh function
properties are defined by Lackey [7]. The matrix equation is
a good way to represent the operation with Walsh functions.
If [X] is a column vector of our sampled brightness values and
[W] a matrix of Walsh functions, then the Walsh coefficients
are given by [A] = [W] [X]. Since the Walsh matrix can be
classified as separable and symmetric, the Walsh matrix is
its own inverse to within a constant multiplier 1/N, i.e.,

\[ [X] = \frac{1}{N} [W] [A] \]

\[ \frac{1}{N} [W] [W] = [I] \]

The Hadamard matrix has this same property [8]:

\[ [H] [H]^t = N [I] \]

or for the symmetric case $[H] [H] = N [I]$.

The lowest order Hadamard matrix defined by Pratt & Andrews
[6] is

\[ [H] = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \]

Constructions exist for nearly all values of order N in the
range 2 < N < 200. If $N = 2^n$, n = integer, and H is a
Hadamard matrix of order N, then

\[ G = \begin{bmatrix} H & H \\ H & -H \end{bmatrix} \]

is a Hadamard matrix of order 2N [6]. The frequency
representation of the transform is given by its sequency, a
term coined by Harmuth. Sequency designates the number of sign
changes, as in a Rademacher function. The two-dimensional
matrix representation of the transform pair is given by:

\[ [F(u,v)] = [H(u,v)] \, [f(x,y)] \, [H(u,v)] \]
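The Hadamard properties quoted above can be checked numerically. The following is a minimal Python sketch, assuming the doubling construction from an order-2 matrix with entries +1 and -1; the function names `sylvester_hadamard`, `gram` and `sequency` are illustrative and do not come from the report. It builds an order-8 matrix, forms the product of the matrix with its transpose, and counts the sign changes (sequency) of each row.

```python
def sylvester_hadamard(k):
    """Return a Hadamard matrix of order 2**k (k >= 1) built by repeated doubling."""
    h = [[1, 1], [1, -1]]  # order-2 starting matrix
    for _ in range(k - 1):
        top = [row + row for row in h]                      # [H  H]
        bottom = [row + [-v for v in row] for row in h]     # [H -H]
        h = top + bottom
    return h


def gram(h):
    """Return the matrix product of h with its own transpose."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in h] for r1 in h]


def sequency(row):
    """Count the sign changes along one row."""
    return sum(1 for a, b in zip(row, row[1:]) if a != b)


H8 = sylvester_hadamard(3)
SEQUENCIES = [sequency(row) for row in H8]
for row, s in zip(H8, SEQUENCIES):
    print(row, "sequency", s)
```

In this construction the rows do not come out in increasing sequency order, although every sequency from 0 to N-1 appears exactly once.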
where $V_i$, $i = 1, 2, \ldots, r$ are the basis vectors

where $\hat{X}_j$ is the reconstructed value of $X_j$, and $Y_{jk}$
is the quantized value of the coefficients.

The algorithm for generating the basis vectors is given
by Haralick et al. [9]. This transform has a fast
implementation as well. This is accomplished by treating the
two-dimensional subimage as a one-dimensional $n^2$ data vector
and the basis set as the direct product of two lower spaces,
i.e., a 16 x 16 subimage is viewed as a 256 x 1 data vector and
transformed by the direct product of two (16 x 16) basis sets.
The first operates on the rows and the second operates on the
columns.

The Slant Transform:

In 1972, the Proceedings of the IEEE, in a Symposium on
Walsh Functions, published an extensive collection of papers on
image applications. A highly publicized application by
Enomoto and Shibata [10] of Japan had been extended to a
general fast transform [11]. The basis of this orthogonal
"Slant Transform" is a discrete sawtooth waveform decreasing in
uniform steps over its length. The advantage is that this
type of basis utilizes the gradual brightness changes in an
image line. The general decomposition form of this matrix

The Discrete Cosine:

In more recent work on the development of fast transforms
for unique compression, Ahmed et al. [12] defined the Discrete
Cosine and demonstrated how it could be implemented
using the fast Fourier Transform. This work developed the
Cosine basis set as a class of Chebyshev polynomials.

The basis is given by letting m = 0, 1, ..., N-1:

\[ \left\{ \frac{1}{\sqrt{2}},\ \cos\frac{(2m+1)k\pi}{2N} \right\}, \qquad k = 1, 2, \ldots, N-1 \]

where N is the size of the subimage. Ahmed compares these
results with the eigenvectors of a correlation matrix with
first order Markov properties. The implication of using this
basis is that it closely approximates the Karhunen-Loeve.
Shanmugam [13] pointed out that this is not surprising
since the Karhunen-Loeve basis vectors and Cosine basis
are asymptotically equivalent as dimensionality gets large.
Haralick [14] has since introduced a storage efficient way
to implement the Discrete Cosine.
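As an illustration of the Discrete Cosine basis, the Python sketch below builds an N = 8 basis and checks two properties: the basis vectors are mutually orthogonal, and a slowly varying line of brightness values puts almost all of its energy into the first two coefficients. It assumes the usual form of the basis, a constant vector of value 1/sqrt(2) followed by cos((2m+1)k pi / 2N) for k = 1, ..., N-1; the ramp test signal and the function names are illustrative choices, not data from the report.

```python
import math


def dct_basis(n):
    """Rows are the n Discrete Cosine basis vectors, each of length n."""
    rows = [[1.0 / math.sqrt(2.0)] * n]  # constant (k = 0) vector
    for k in range(1, n):
        rows.append([math.cos((2 * m + 1) * k * math.pi / (2 * n))
                     for m in range(n)])
    return rows


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


N = 8
BASIS = dct_basis(N)

# A gradual brightness ramp, standing in for a smooth image line.
ramp = [float(m) for m in range(N)]
coefficients = [dot(vec, ramp) for vec in BASIS]

# Every basis vector has squared length N/2, so coefficient energy is c^2 / (N/2).
energies = [c * c / (N / 2.0) for c in coefficients]
total_energy = dot(ramp, ramp)
fraction_in_first_two = (energies[0] + energies[1]) / total_energy
print("fraction of energy in first two coefficients:", fraction_in_first_two)
```

This energy packing is what makes it possible to drop or coarsely quantize the remaining coefficients.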


Singular Value Decomposition:

Andrews et al. [23] have suggested that Outer Product
Expansion methods may be applied to bandwidth reduction. It
is suggested that the concepts of eigenvalue map, condition
number, and rank of an image are then useful guides to
computer storage and savings. Andrews investigates the
expansion or decomposition of the image by considering it
a matrix, G, which is an $n^2$ matrix digitized such that
the $i$-th row and $j$-th column correspond to the x and y
spatial coordinates of a scene. Now any matrix (in this case G)
may be represented by:

\[ G = U D V^t \]

where U and V are arbitrary unitary matrices and D is a
matrix comprised of the coefficients of expansion of G.

The elements of D are:

\[ d_{ij} = u_i^t \, G \, v_j \]

$u_i$ and $v_j$ being the columns of U, V respectively.

Now if D can be diagonalized with suitable matrices U and
V the expansion we are left with is known as a singular
value decomposition. Given that D will have r diagonal
values the matrix G can be represented by:

\[ G = \sum_{i=1}^{r} d_i \, u_i v_i^t \]

The $d_i$ terms are known as singular values of G and $u_i$, $v_i$
are the corresponding singular vectors. To choose the
appropriate values for U and V:

\[ G G^t = U \Lambda U^t \quad \text{and} \quad G^t G = V \Lambda V^t \]

where $\Lambda$ is the diagonal matrix of eigenvalues of $G G^t$.

Then the $d_i$ (singular values) will be the square roots of the
eigenvalues of $G^t G$. By ordering the eigenvalues in
descending order the smallest mean square error is achieved.

The eigenvalue map is the plot of singular values as
a function of the ordered index. Andrews then defines the
condition number as:

\[ C(G) = \frac{d_{\max}}{d_{\min}} \]

The compression relation now becomes apparent. For the
case of retaining k outer products we approximate the
image by:

\[ G_k = \sum_{i=1}^{k} d_i \, u_i v_i^t, \qquad k \le r \]

where the image was defined as rank r.

This approach to compression has many interesting
possibilities. As an example the first and second
eigenimages contain high and low frequencies. This is not
available in the standard transform approach.
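A short worked relation may help make the link between the singular values and the compression error explicit. It is a standard property of the singular value decomposition rather than a result quoted from the report, and the symbol $G_k$ for the image rebuilt from the k largest outer products is notation assumed here. With orthonormal singular vectors and singular values ordered so that $d_1 \ge d_2 \ge \cdots \ge d_r$:

\[ G_k = \sum_{i=1}^{k} d_i \, u_i v_i^t, \qquad G - G_k = \sum_{i=k+1}^{r} d_i \, u_i v_i^t \]

\[ \sum_{m}\sum_{n} \left[ G(m,n) - G_k(m,n) \right]^2 = \sum_{i=k+1}^{r} d_i^2 \]

The total squared error is therefore the sum of the squared singular values that were discarded, which is why a rapidly falling eigenvalue map indicates that few outer products need to be retained.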

The generation of basis sets and two-dimensional transforms
is not the complete story in image compression. Once
an image has been transformed it must be coded for the
desired compression and transmission channel. Numerous papers
have been produced on this facet of the compression
application. The most noteworthy are those by Max [15], Habibi
[16] and Wintz and Tasto [17]. These approaches investigate
the means of quantizing coefficients of the transformed
image and the effects of simulated channel errors. In
very general terms they may be classed into three categories:
fixed, variable word length and adaptive. Fixed implies that
the same number of bits will be used for coding each
coefficient of each subimage. Variable length codes assign
a different number of bits to each coefficient of any one
subimage, but the same coefficient of two different
subimages will be coded identically. These schemes are
adaptive from image to image. The adaptive systems assign
codes in an "instantaneous" manner. This affords coding
changes on a line to line basis [18]. This presumes a
method of segmenting the image into classes or areas. The
coding used in this research is an adaptation of the variable
length codes. A fixed number of bits is assigned to each
subimage and the distribution is dependent on the variance
of the coefficients (see Chapter 3).

All of these transform methods have been utilized with
single frame, monochromatic film. It is a natural extension
and a necessary one to consider frame-to-frame problems. A
collection of papers on the interframe subject can be found
in Huang and Tretiak [19]. The interframe redundancy
reduction is possible because of the high correlation between
adjacent frames. Since the background information does not
change significantly only a few picture elements are
contributing new information. The transforms discussed
previously are being applied to this problem by the University
of Southern California in three dimensions [20]. Estimates
of a 5:1 reduction in the number of bits needed for
reconstruction are being made.

The transform domain coefficients are also being combined
with Differential Pulse Code Modulation (DPCM). This approach
is not entirely new. Haralick [21] and Habibi [22] applied
DPCM internal to an image (intra-frame) to achieve an
additional compression of 1.5 to 2.0 bits/picture element. The
interframe DPCM presumes a means of storing at least one
frame, and the image difference should require fewer bits
for coding. This is generally true if the scenes are
not flat (same average gray level). In these methods a
threshold of change is decided upon (not necessarily defined)
and a change is noticed if it is above the threshold. This new
information is then transmitted.
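The threshold idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the report does not give the threshold value, the form of the transmitted update, or how the differences are quantized, so the list of (index, difference) pairs and the name `interframe_update` are hypothetical. The receiver is assumed to hold the previous frame (or its transform coefficients) and to add each received difference to the stored value.

```python
def interframe_update(previous, current, threshold):
    """Compare two frames element by element.

    Returns (updates, reconstructed), where updates is the list of
    (index, difference) pairs that exceed the threshold and
    reconstructed is what a receiver holding `previous` would rebuild.
    """
    updates = []
    reconstructed = list(previous)
    for index, (old, new) in enumerate(zip(previous, current)):
        difference = new - old
        if abs(difference) > threshold:
            updates.append((index, difference))      # new information: send it
            reconstructed[index] = old + difference  # receiver applies the update
    return updates, reconstructed


stored_frame = [10, 10, 10, 10, 40, 40]
next_frame = [10, 11, 25, 10, 40, 38]
sent, rebuilt = interframe_update(stored_frame, next_frame, threshold=2)
print("transmitted updates:", sent)
print("receiver frame:     ", rebuilt)
```

Small changes below the threshold are never sent, so the rebuilt frame differs from the true frame by at most the threshold at each position after a single update.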

A disadvantage of the three-dimensional transforms is their
storage requirements, while DPCM is relatively simple
and requires only that one frame be stored. The conditional
change (threshold) methods generate codes at uneven rates
depending on the type of motion, which requires buffers
that limit the motion that can be tolerated.

The references on transforms, coding, and the interframe
problem, although numerous, have not solved the problem of
optimizing image transmission. They have, however, sparked
the interest of many researchers, each contributing his own
unique insight and, therefore, creating a better understanding
of system possibilities. Since this research deals with
transform image compression, those research articles applied
in this report have been emphasized.

3.  Definition of the Objective

It is the objective of this study to evaluate transform
bandwidth compression techniques for removing the spatial
redundancy of typical photography and television sensors used
on Remotely Piloted Vehicle Systems. This study should
further establish the feasibility of a system compression
of 50:1. The goal of a 50:1 compression ratio will be achieved
in three stages:

1.  Transform  compression  of  12:1 

2.  Differential  Pulse  Code  Modulation  (DPCM) 
to  yield  a compression  of  1.5:1 

3.  Frame  rate  reduction  of  3:1 
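The three stage ratios multiply, so the cascade gives 12 x 1.5 x 3 = 54:1, slightly above the 50:1 goal. The Python sketch below checks that arithmetic and applies it to the rate of about 20 Megabits/sec quoted earlier in the report for commercial television. The function names are illustrative, and treating the three stages as independent multiplicative factors is an assumption of the sketch.

```python
from fractions import Fraction

# Stage ratios from the list above (exact fractions avoid rounding).
STAGE_RATIOS = {
    "transform": Fraction(12, 1),
    "dpcm": Fraction(3, 2),
    "frame_rate": Fraction(3, 1),
}


def overall_ratio(stages):
    """Multiply the per-stage ratios to obtain the cascade ratio."""
    total = Fraction(1)
    for ratio in stages.values():
        total *= ratio
    return total


def compressed_rate(source_bits_per_sec, ratio):
    """Bit rate remaining after compression by the given ratio."""
    return source_bits_per_sec / float(ratio)


total = overall_ratio(STAGE_RATIOS)
print(f"overall ratio {total}:1")
print(f"20 Mbit/s source -> {compressed_rate(20e6, total) / 1e3:.0f} kbit/s")
```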

In evaluating the possible transform compression schemes, the
"best" transform will be utilized in the hybrid (transform/DPCM)
approach.

Specific transforms under analysis are:

1.  Hadamard

2.  Fourier

3.  SLANT

4.  DLB

5.  Discrete Cosine

6.  Fast Karhunen-Loeve

7.  Principal Components

4.  Image Data Compression
In the search for efficient transform and coding schemes,
orthogonal transformations with their decomposition properties
and matrix theory provide powerful tools within the digital
environment. In finite dimensional vector space orthogonal
transformations may be viewed as linear operators from one
space to another which preserve inner products. By the analysis
of the eigenvalues and corresponding eigenvectors an optimum
solution may be found in the mean squared error sense when
based on the second order statistics of the image. Other
orthogonal transformations may then be compared to the
optimum and may be sufficiently accurate and considerably
easier to implement. In this study, the optimum solution
based on eigenvalues and eigenvectors has been considered
as well as several other leading orthogonal transformation
schemes. These approaches and their comparison to the
optimum solution are considered in Chapter 4. But first, we
must understand how compression of an image is accomplished.
Transform image data compression is accomplished by squeezing
the image through a small dimensional hole. The squeezed
image is the compressed image and what results after having
passed the image through the small dimensional hole is the
reconstructed image. More precisely, given an M x M digital
image obtained by an analog to digital conversion process,
the image is considered to be divided into a set of
non-overlapping subimages (or windows) of size n x n; n < M.
If each n x n subimage is interpreted as a vector in an $n^2$
dimensional space, where each of the $n^2$ coordinates
corresponds to one of the $n^2$ picture elements (PELS) in the
subimage, then the digital image can be represented as a
collection of K vectors each of dimension $n^2$; $K = (M/n)^2$.
(See Figure 1.1)

Now let N denote the dimension of each data vector ($N = n^2$),
and $x_1, x_2, \ldots, x_K$ be the collection of all data vectors
from the K non-overlapping windows in the image. The transform
compression is achieved by the projection of these N-tuples
or N-dimensional data vectors onto an R dimensional subspace
using a set of R N-dimensional basis vectors $v_1, v_2, \ldots, v_R$.
It is obvious that in order to compress, R must be less than N.
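The partitioning step can be illustrated with a small Python sketch that cuts an M x M image into non-overlapping n x n windows and flattens each one, row by row, into a data vector of dimension N = n squared. The row-by-row ordering inside a window and the left-to-right, top-to-bottom ordering of the windows are assumptions made for the example; the report does not state them in this passage. The name `image_to_vectors` is illustrative.

```python
def image_to_vectors(image, n):
    """Split a square image (list of rows) into K = (M/n)**2 vectors of length n*n."""
    m = len(image)
    if any(len(row) != m for row in image):
        raise ValueError("image must be square")
    if m % n != 0:
        raise ValueError("window size n must divide image size M")
    vectors = []
    for top in range(0, m, n):            # window row position
        for left in range(0, m, n):       # window column position
            vectors.append([image[r][c]
                            for r in range(top, top + n)
                            for c in range(left, left + n)])
    return vectors


# A 4 x 4 test image whose picture elements are numbered 0..15.
M, n = 4, 2
test_image = [[r * M + c for c in range(M)] for r in range(M)]
data_vectors = image_to_vectors(test_image, n)
print("K =", len(data_vectors), "vectors of dimension", len(data_vectors[0]))
for vector in data_vectors:
    print(vector)
```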
The $i$-th projection of the data vector $x_j$ is given by

\[ y_{ij} = v_i^t \, x_j \]

where $v_i^t = (v_{i1}, v_{i2}, \ldots, v_{iN})$ is the transpose of
the $i$-th basis vector. The R projections of each $x_j$ are
coded independently and transmitted over a binary channel to
the receiver. At the receiver, the reconstruction takes
place by decoding the values of $y_{ij}$ and approximating $x_j$
with $\hat{x}_j$ according to:

\[ \hat{x}_j = \sum_{i=1}^{R} y_{ij} \, v_i \]

or in matrix notation:

\[ \hat{x}_j = [\, v_1 \ v_2 \ \cdots \ v_R \,] \, c_j \]

where $c_j$ is the vector of R coefficients for any $j$-th vector.
Of course, sometime during the transmission the basis vectors
must be transmitted or known beforehand. The total squared
error of reconstruction is:

\[ \epsilon^2 = \sum_{n=1}^{N} \left| x_n - \hat{x}_n \right|^2 \]
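The projection, reconstruction and error steps can be traced with a small numerical example. The Python sketch below assumes an orthonormal basis (a 4-point Hadamard set scaled by 1/2 is used only because its arithmetic is exact) and keeps R = 2 of the N = 4 projections; the data vector and the function names are illustrative. For an orthonormal basis the squared reconstruction error equals the energy of the coefficients that were not transmitted, which the last lines verify.

```python
def project(basis, x):
    """Coefficients of x on each basis vector (dot products)."""
    return [sum(v * a for v, a in zip(vec, x)) for vec in basis]


def reconstruct(basis, coefficients):
    """Weighted sum of basis vectors using the given coefficients."""
    length = len(basis[0])
    return [sum(c * vec[i] for c, vec in zip(coefficients, basis))
            for i in range(length)]


def squared_error(x, x_hat):
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


h = 0.5  # scaling that makes each 4-point Hadamard vector unit length
FULL_BASIS = [
    [h, h, h, h],
    [h, h, -h, -h],
    [h, -h, -h, h],
    [h, -h, h, -h],
]

x = [10, 12, 11, 13]                  # one data vector, N = 4
all_coefficients = project(FULL_BASIS, x)

R = 2                                 # number of projections transmitted
kept_basis = FULL_BASIS[:R]
x_hat = reconstruct(kept_basis, all_coefficients[:R])

error = squared_error(x, x_hat)
discarded_energy = sum(c * c for c in all_coefficients[R:])
print("reconstruction:", x_hat)
print("squared error:", error, "discarded energy:", discarded_energy)
```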
5.  The Nature of Fast Transforms

Given a set of orthogonal basis vectors, a fast algorithm
is based on being able to decompose this matrix of basis
vectors into several matrices where each row operation is
less than what is required by the direct transform. Consider
for a moment a one-dimensional transform. Given a basis set
we can transform the data vector x of dimension N to another
N dimensional vector whose components are the coordinates of
the vector x in terms of the given basis set.

Let $v_1, v_2, \ldots, v_n$ be the given basis set, where each
$v_i$ is a column vector. Let

\[ V = \begin{bmatrix} v_1' \\ v_2' \\ \vdots \\ v_n' \end{bmatrix} \]

be a matrix whose rows are the given basis vectors. The
transform C of the vector X is given by:

C = VX

When the basis set is orthonormal, the inverse transform is
given by X = V'C. This type of transform done in a brute force
manner would require $n^2$ operations for an n x n matrix V. The
fast transform implementation permits the transform to be
done in $2n \log_2 n$ operations.

A fast transform implementation exists whenever the
matrix V can be represented in a special way as a product
of, say, k square matrices, each square matrix having only 2
non-zero elements per row. When represented in this manner,
the matrix V, which has $2^k$ rows and $2^k$ columns, can multiply
a vector and require only $2k \cdot 2^k$ operations. The square
matrices are often represented in a flow diagram with each
layer in the diagram representing the multiplication by
one square matrix.
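The operation count can be illustrated with the Hadamard transform, whose matrix of order 2 to the power k factors into k layers with two non-zero elements (+1 or -1) per row. The Python sketch below is a minimal illustration, not the report's program: it applies the k layers in place, counts one operation per non-zero element used, and compares the result with the brute force matrix product. The function names and the test vector are illustrative.

```python
def hadamard(k):
    """Hadamard matrix of order 2**k, built by repeated doubling."""
    h = [[1]]
    for _ in range(k):
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def direct_transform(matrix, x):
    """Brute force product: n*n element operations for an n x n matrix."""
    result = [sum(a * b for a, b in zip(row, x)) for row in matrix]
    return result, len(matrix) * len(x)


def fast_hadamard(x):
    """Layered transform: each layer uses 2 non-zero elements per output."""
    data = list(x)
    n = len(data)
    operations = 0
    half = 1
    while half < n:                       # one pass per layer, k passes in all
        for start in range(0, n, 2 * half):
            for i in range(start, start + half):
                a, b = data[i], data[i + half]
                data[i], data[i + half] = a + b, a - b
                operations += 4           # 2 outputs x 2 non-zero elements each
        half *= 2
    return data, operations


k = 3
x = [1, 0, 2, 5, 3, 3, 1, 7]
slow, slow_ops = direct_transform(hadamard(k), x)
fast, fast_ops = fast_hadamard(x)
print("direct:", slow, "operations:", slow_ops)
print("fast:  ", fast, "operations:", fast_ops)
```

For k = 3 the layered form uses 2k times 2 to the power k = 48 element operations against 64 for the direct product, and the gap widens quickly as k grows.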




The fast algorithm is based on the principle that the
basis vectors of $R^n$ can be represented by taking the direct
product of the basis vectors of two lower dimensions. Let
$R^n$ be formed by the direct product of $R^p$ and $R^q$ such that
n = pq. Let $U = (u_1, u_2, \ldots, u_p)$ and $V = (v_1, v_2, \ldots, v_q)$
be two vectors of dimension p and q respectively. Then we can
define a direct product vector of dimension pq by:

\[ UV = (u_1 v_1, u_1 v_2, \ldots, u_1 v_q,\ u_2 v_1, u_2 v_2, \ldots, u_2 v_q,\ \ldots,\ u_p v_1, u_p v_2, \ldots, u_p v_q) \]
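A direct product vector is easy to form in code. The Python sketch below builds the products of a 2-dimensional orthogonal set with a 3-dimensional orthogonal set and checks that the six resulting 6-dimensional vectors are again mutually orthogonal. The two small basis sets are arbitrary examples chosen for exact integer arithmetic; they are not the bases used in the report.

```python
def direct_product(u, v):
    """Return (u1*v1, u1*v2, ..., u1*vq, u2*v1, ..., up*vq)."""
    return [ui * vj for ui in u for vj in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Example orthogonal sets for p = 2 and q = 3 (illustrative values).
U_SET = [[1, 1], [1, -1]]
V_SET = [[1, 1, 1], [1, 0, -1], [1, -2, 1]]

PRODUCT_SET = [direct_product(u, v) for u in U_SET for v in V_SET]

print(direct_product([1, 2], [3, 4, 5]))
for vector in PRODUCT_SET:
    print(vector)
```

The orthogonality follows because the dot product of two product vectors factors into the dot product of their U parts times the dot product of their V parts.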


Now if we have a set of vectors $\{U_1, U_2, \ldots, U_p\}$ which are
an orthogonal basis for $R^p$ and a set of vectors
$\{V_1, V_2, \ldots, V_q\}$ which are an orthogonal basis for $R^q$
then we can define a set of product vectors
$\{U_i V_j \mid i = 1, 2, \ldots, p,\ j = 1, 2, \ldots, q\}$ where
each product vector is defined as above. As an example of a
simple transform, Figure 1.2 shows the representation used
for the more general case. For the forward transform case,
let HD(i), i = 1, 2, ..., n denote the dimension of each
orthogonal basis set:

\[ N = \prod_{i=1}^{n} HD(i) \]

The whole transformation is divided into n layers, one for
each orthogonal basis set. Figure 1.3 is a pictorial
representation of such a scheme with n = 2 and two basis sets
comprised of:
Now each layer will have N boxes, each box representing a
basis vector of the corresponding basis. If

\[ MUL(j) = \prod_{i=1}^{j} HD(i) \]

the $j$-th layer will be comprised of N/MUL(j) sets of boxes.
Each set contains MUL(j-1) boxes all having the same basis
vector repeated MUL(j-1) times in order. Each box in the $j$-th
layer has HD(j) inputs with one output which is the dot product
of the input components and the basis vector. If x(k),
k = 1, ..., N is the input to the layer then the input to the
$b$-th box is given as follows.

[Figure 1.2a. Illustration of a 2 x 2 transform. Each data
point is multiplied by the matrix element on the path leading
to a coefficient, and the results are then summed to produce
that coefficient.]

[Figure 1.2b. Illustration of a 2 x 2 transform in which the
rows of the matrix are considered as basis vectors; each
output is the projection of X on one of the basis vectors.]

[Figure: diagram of an 8 dimensional transform.]

The starting index for each distinct basis vector of a
set within the layer is the number of the first box in that
set and is incremented by one for each repetition of that
basis vector. Within each box the indexes of the input
components are incremented by MUL(j-1), where MUL(0)=1. Haralick
et al. [24] have shown that this scheme can be further
simplified as in Figure 1.4. In this case each box in each layer
is alike and represents all the vectors of the basis set. In
this simplification each transform has an equal number of
inputs and outputs with the same indexes. Each output is the
dot product of the input components and the corresponding
basis vector. Further, there are N/HD(i) transforms in the
ith layer. Haralick has defined the indexing of the input
components as follows: "In the jth layer the indexing of the
input components for the kth transform starts from
(K-1)*HD(j)+1 if MOD(K-1, MUL(j-1)) = 0, if not, the
previous index is incremented by 1. Within the transform
the indexes go by MUL(j-1)." [24] This forward transform is
equivalent to obtaining the direct product of $H_2$ and $D_4$
given previously to form $T_8$, an 8 x 8 orthogonal basis set
defined by $T_8 = H_2 \times D_4 = \{t_1, t_2, t_3, t_4, t_5, t_6, t_7, t_8\}$.
Figure 1.4. Illustration of the Indexing Scheme for an N=8 Fast
Transform. In the First Layer There are Two (4x4)
Transforms, in the Second Layer There are Four (2x2) Transforms, for
the Decomposition of N=8=4*2. The Numbers Labeling
the Input Indicate What Position the Data Points are
Coming From. The Numbers in the Output Side Label
the Input Positions for the Next Layer.
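The indexing rule quoted from Haralick can be sketched in a few lines of Python. This is a minimal illustration, not the report's implementation: the function names are hypothetical, and the 4 x 4 matrix used for $D_4$ is an assumed stand-in with mutually orthogonal rows, since its definition is not reproduced here. The sketch generates the input indexes for each layer and checks that the layered transform agrees with multiplication by the direct product $H_2 \times D_4$.

```python
def layer_indexes(hd):
    """For each layer j, list the 1-based input indexes of every transform.

    hd[j - 1] plays the role of HD(j) in the text.
    """
    n_total = 1
    for d in hd:
        n_total *= d
    layers = []
    mul_prev = 1  # MUL(0) = 1
    for d in hd:
        transforms = []
        start = 0
        for k in range(1, n_total // d + 1):
            if (k - 1) % mul_prev == 0:
                start = (k - 1) * d + 1   # restart rule from the quoted text
            else:
                start += 1                # otherwise previous start plus one
            transforms.append([start + m * mul_prev for m in range(d)])
        layers.append(transforms)
        mul_prev *= d                     # MUL(j) = MUL(j-1) * HD(j)
    return layers


def fast_transform(x, bases):
    """Apply the layers in order; bases[j - 1] is the HD(j) x HD(j) basis."""
    y = list(x)
    for basis, transforms in zip(bases, layer_indexes([len(b) for b in bases])):
        out = list(y)
        for idx in transforms:
            vals = [y[i - 1] for i in idx]
            for row, i in zip(basis, idx):
                out[i - 1] = sum(r * v for r, v in zip(row, vals))
        y = out
    return y


def direct_product(a, b):
    """Direct (Kronecker) product of two square matrices."""
    return [[p * q for p in ra for q in rb] for ra in a for rb in b]


H2 = [[1, 1], [1, -1]]
# Assumed stand-in for D4: any 4 x 4 matrix with orthogonal rows would do.
D4 = [[1, 1, 1, 1], [3, 1, -1, -3], [1, -1, -1, 1], [1, -3, 3, -1]]
x = [1, 2, 3, 4, 5, 6, 7, 8]

fast = fast_transform(x, [D4, H2])
direct = [sum(t * v for t, v in zip(row, x)) for row in direct_product(H2, D4)]
print(layer_indexes([4, 2]))
print(fast == direct)
```

For N=8=4*2 the first layer reads positions 1-4 and 5-8, and the second layer reads the pairs (1,5), (2,6), (3,7), (4,8), which matches the description in the Figure 1.4 caption.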
Let T be a matrix whose rows are $t_1, \ldots, t_8$, respectively. T
is the matrix transforming a vector into the coordinate system
of $T_8$. This scheme can be applied to the inverse operation
as well. Clearly $TT' = D$, where D is a diagonal matrix.
Multiplying both sides by $D^{-1}$, $TT'D^{-1} = I$. Thus, the inverse
of T is easily formed as $T^{-1} = T'D^{-1}$. Therefore if T is the
square matrix of order n whose rows form the orthogonal basis
set and $d_1, \ldots, d_n$ are the normed squares of the basis
vectors, then the inverse definition is equal to:

$$T^{-1} = T' \begin{pmatrix} 1/d_1 & & \\ & \ddots & \\ & & 1/d_n \end{pmatrix}$$

The columns of the matrix $T^{-1}$ will be the basis vectors for
the corresponding layer in the inverse operation scheme.
Figure 1.5 depicts the inverse layer operation which is the
mirror image of the forward transform. The indexing scheme
is as follows: let HD(i), i=1,...,n, denote the dimension of
each inverse orthogonal basis.

$$N = \prod_{i=1}^{n} HD(i), \qquad MULI(j) = \prod_{i=1}^{j} HD(i)$$

Then, also given by Haralick, the indexing of "the input
components for the jth layer and Kth transform starts from
(K-1)*HD(j)+1 if MOD((K-1), MULI(j+1)) = 0. If not, the
previous starting index is incremented by 1. Within the
transform the indexing goes by MULI(j+1)." [24]
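The inverse relation $T^{-1} = T'D^{-1}$ can be checked numerically. The sketch below is only an illustration: the 3 x 3 matrix is an assumed example with mutually orthogonal (but not normalized) rows, and the helper names are hypothetical. Exact fractions are used so the identity can be tested without rounding.

```python
from fractions import Fraction


def inverse_from_orthogonal_rows(t):
    """Return T^-1 = T' D^-1 for a matrix T with mutually orthogonal rows."""
    n = len(t)
    d = [sum(v * v for v in row) for row in t]  # normed squares, diagonal of TT'
    # Entry (i, j) of T' D^-1 is T[j][i] / d[j].
    return [[Fraction(t[j][i], d[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


T = [[1, 1, 1], [1, 0, -1], [1, -2, 1]]   # illustrative orthogonal rows
T_inv = inverse_from_orthogonal_rows(T)
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(matmul(T, T_inv) == identity)
```

Because only a transpose and n divisions are needed, the inverse layers cost the same as the forward layers.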

6 - The  Error  Criteria 

The basic problem of transform coding is to determine
what basis vectors are to be used in the projection operation.
Crucial in such a choice is our criterion of best. In this
particular program several indexes of performance have been
used for evaluation of the imagery. These measures of
performance are: 1) visual comparison of reconstructed images,
2) the root-mean-square (RMS) error between the original
and reconstructed images, 3) correlated RMS error.

In visual comparison the cosmetic qualities of the
reconstructed image are compared to the original. The major
effects to be looked for in the reconstructed image are
blocking, loss in resolution and contouring. Blocking is
caused by reconstructing the image with an incomplete set of
frequency components, some of the components having been set to
zero in the coding process. If a poor choice of components
has been used this will show up as discontinuities in gray
level at the boundaries of the subimages used in the
partitioning of the original image. This will also happen if
insufficient bits are used in the coding scheme for
reconstruction. Loss in resolution is a spreading of
boundaries, generally noticeable around man-made geometric objects,
or a fuzzy image. Contouring is the appearance of boundaries
within the image rather than gradual gray level changes. This
is generally caused by insufficient bits to reproduce the
dynamic range of the gray levels within an image.

The RMS error has been accepted as a quality value in
judging the reconstructed images. Although a poor criterion
in general with regard to images, it does provide a figure of
merit of the errors which have accumulated in the
transform-coding-reconstruction process. The root-mean-square error
criterion may be stated as follows:

$$\text{RMS Error} = \sqrt{\frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \left[ I(i,j) - K(i,j) \right]^2}$$

where I(i,j) represents elements of the original image
and K(i,j) represents elements of the reconstructed image.
In this study images with $(256)^2$ elements and $(512)^2$ elements
were used. The RMS error can give a false impression
if an image has been enhanced in any manner during the process:
the RMS error will be higher but the image may be more pleasing
to the observer. This criterion must therefore be used in
combination with other criteria. In images where no
enhancement or non-linear process has been involved a large
RMS error generally appears with poorer images.
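As a small worked illustration of the RMS criterion, the following Python sketch computes the error between two images stored as lists of rows. The function name and the 2 x 2 sample values are hypothetical; the sketch divides by the total number of elements, which equals $N^2$ for the square images considered in the text.

```python
import math


def rms_error(original, reconstructed):
    """Root-mean-square difference between two equally sized gray-level images."""
    n_rows = len(original)
    n_cols = len(original[0])
    total = sum((original[i][j] - reconstructed[i][j]) ** 2
                for i in range(n_rows) for j in range(n_cols))
    return math.sqrt(total / (n_rows * n_cols))


original = [[10, 20], [30, 40]]        # hypothetical gray levels
reconstructed = [[12, 20], [30, 36]]   # hypothetical reconstruction
print(rms_error(original, reconstructed))
```

Here the squared differences are 4, 0, 0 and 16, so the result is the square root of 20/4 = 5.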

Combining these two measures of fidelity with a measure
of spatial correlation of errors yields a more complete
idea of the performance of transform compression. The correlated
RMS error is formed in the following manner.

The gray values of the reconstructed images are rounded
off to integers, the error image between original and
reconstructed is obtained, and the RMS error is computed from the
error picture. The correlated RMS error is the product of
the RMS error and a spatial correlation measure obtained
from the error image. This correlation measure is a
measure of spatial neighborly dependence of the error within
the image. The computation is as follows:


Let $Z_x = \{1, \ldots, N_x\}$ and $Z_y = \{1, \ldots, N_y\}$ be the x and y spatial
domains of the error picture (row and column indexes respectively).
Let $S = \{s_1, s_2, \ldots, s_n\}$ be the set of gray tones in the error
image, so that the error image is a function

$$I: Z_x \times Z_y \to S$$

We first compute the relative frequency of occurrence
of two neighboring cells separated by a distance whose radius
is r, one with gray tone $s_i$ and the other with gray tone
value $s_j$.

Let R be a binary relation defining the specified spatial
relationship of any two resolution cells,

$$R \subseteq (Z_x \times Z_y) \times (Z_x \times Z_y)$$

defined by

$$R = \{ ((a,b),(c,d)) \mid \rho((a,b),(c,d)) = r \}$$

where $\rho$ is a metric on $Z_x \times Z_y$.

The joint probability density function may then be defined by

$$P(s_i, s_j) = \frac{\#\{ ((a,b),(c,d)) \in R \mid I(a,b) = s_i,\ I(c,d) = s_j \}}{\#R}$$

where # denotes the number of elements in the set.

The marginal probability is given by:

$$P(s_i) = \frac{\#\{ (n,m) \in Z_x \times Z_y \mid I(n,m) = s_i \}}{\#(Z_x \times Z_y)}$$
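The joint and marginal relative frequencies can be sketched directly from these definitions. The code below is a brute-force illustration under stated assumptions: the function names are hypothetical, and the city-block distance is chosen as the metric $\rho$ only for concreteness, since the text does not fix a particular metric.

```python
from collections import Counter
from itertools import product


def city_block(p, q):
    """Assumed metric: city-block distance between two (row, column) cells."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def joint_probability(image, r, metric=city_block):
    """Relative frequency of gray-tone pairs over all cell pairs at distance r."""
    cells = list(product(range(len(image)), range(len(image[0]))))
    counts = Counter()
    for p in cells:
        for q in cells:
            if metric(p, q) == r:      # the ordered pair (p, q) belongs to R
                counts[(image[p[0]][p[1]], image[q[0]][q[1]])] += 1
    total = sum(counts.values())       # this is #R
    return {pair: c / total for pair, c in counts.items()}


def marginal_probability(image):
    """Relative frequency of each gray tone over the whole image."""
    tones = [tone for row in image for tone in row]
    return {tone: c / len(tones) for tone, c in Counter(tones).items()}


error_image = [[0, 0], [1, 1]]         # hypothetical 2 x 2 error picture
joint = joint_probability(error_image, r=1)
marginal = marginal_probability(error_image)
print(joint)
print(marginal)
```

Because every ordered pair (p, q) in R is matched by (q, p), the resulting table is symmetric in its two gray-tone arguments.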


Now if we let $P(s_i, s_j)$ denote the normalized gray tone
transition matrices then the correlation measure is defined
in terms of the average mutual information for two dimensions
and the source entropy. The source entropy is

$$h = -\sum_{K} P(K) \log_2 P(K), \qquad P(K) = \sum_{i} P(s_i, K) = \sum_{i} P(K, s_i)$$

and $P = P'$.

The average correlation measure contains information about
the nature of the spatial distribution of the error picture
and the RMS gives a measure of the magnitude of the error
between original and reconstructed. As an example a random
Gaussian noise error picture was generated. The correlation
function was generated with lag 1,2,5 (distance of cell
separation) and the correlation was 0 as anticipated. See
Figure 1.5.

7 Principal  Components  - An  Optimum  Transform  in  Terms 
of  RMS  Error  Criteria 

In view of the fact that all transform compression
schemes should be compared to the optimum, and we have
developed a two-dimensional fast Karhunen-Loeve (principal
components) transform as part of this study, the principal components
approach should be included as an integral part of the background
information necessary to the understanding of compression
techniques. The K-L transform is optimum in the sense of
minimum mean square error. In applying the K-L transform
for data compression, the M x M original image is split up
into K non-overlapping subimages, or windows, of size n x n, where n<M.
The elements of the n x n subimage form the $n^2$
elements of the vectors or windows. To do a K-L transform,
the first step is to form the autocovariance which is defined
by:

$$\Sigma = E\{(x-m)(x-m)'\} = E\{xx'\} - mm'$$

where $m = E\{x\}$ and x is the vector of gray levels in a
window. The autocovariance matrix will be an $n^2 \times n^2$ matrix.

Figure 1.5. Correlation function R(r), Gaussian random noise image.

Lag distance r    R(r)
1                 1.03 x 10^-5
2                 2.06 x 10^-6
3                 1.67 x 10^-6

The second step is to find the eigenvalues and corresponding
eigenvectors of this autocovariance matrix. The matrix of
eigenvectors, call it T, will diagonalize the covariance
matrix:

$$T' \Sigma T = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_{n^2} \end{pmatrix}$$

Let V be the matrix whose R rows are the eigenvectors with
the R largest eigenvalues of the $n^2 \times n^2$ covariance matrix.
The K-L transform of a vector x is defined by $y = V(x-m)$.
Compression is achieved by the fact that $R < n^2$. The inverse
transformation is defined by $x = V'y + m$. The total squared
error of this process is the sum of the $(n^2 - R)$ smallest
eigenvalues. Transforming the original random vector
to the vector of principal components amounts to rotating
the coordinate system to a new system with inherent
statistical properties. In terms of variance the first
principal component is the normalized linear combination of
the original random variables with maximum variance, the
second is the normalized linear combination having a variance
greater than all others except the first and so on. The
coefficients generated by this method are uncorrelated and,
for jointly Gaussian data, statistically independent. This is of particular
importance in the image processing scheme since the
coefficient's energy contribution is based on the variance of the
coefficient for a zero mean image. In fact, the orthogonal
transformation using the eigenvectors leaves invariant the
generalized variance and the sum of the variances of the
components. This theorem is proved in Anderson [5] and repeated
here as support for the above claim.

Theorem 1 - An orthogonal transformation of a random vector
x leaves invariant the generalized variance and the sum
of the variances of the components.

Let V be the transformation matrix whose rows are
the eigenvectors of the covariance matrix of dimension n, i.e.,

$$V: R^n \to R^n, \qquad c = Vx \ \text{(vector of components)}$$

with $E\{x\} = 0$ and $E\{xx'\} = \Sigma$;
then $E\{c\} = 0$ and $E\{cc'\} = V \Sigma V'$.

The generalized variance of c is:

$$|V \Sigma V'| = |V| \cdot |\Sigma| \cdot |V'| = |\Sigma| \, |VV'| = |\Sigma|$$

which is the generalized variance of x.

The sum of the variances of the components of c is

$$\sum_i E\{c_i^2\} = \mathrm{tr}(V \Sigma V') = \mathrm{tr}(\Sigma V'V) = \mathrm{tr}(\Sigma I) = \sum_i E\{x_i^2\}$$
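Theorem 1 can be checked numerically on a small case. The sketch below is an illustration only: the 2 x 2 covariance matrix and the rotation angle are arbitrary assumed values, and the helper names are hypothetical. A plane rotation is used as the orthogonal matrix V, and the determinant (generalized variance) and trace (sum of variances) are compared before and after the transformation.

```python
import math


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


theta = 0.3                                   # arbitrary rotation angle
V = [[math.cos(theta), math.sin(theta)],
     [-math.sin(theta), math.cos(theta)]]     # orthogonal: V V' = I
S = [[4.0, 1.0], [1.0, 3.0]]                  # assumed covariance of x

C = matmul(matmul(V, S), transpose(V))        # covariance of c = V x
det_c = C[0][0] * C[1][1] - C[0][1] * C[1][0]
trace_c = C[0][0] + C[1][1]
print(det_c, trace_c)
```

For this choice of S the determinant is 11 and the trace is 7, and both values are preserved by the rotation up to rounding.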

In bandwidth compression (dimensionality reduction) we seek
a function $f: R^n \to R^m$ where $R^m$ is a smaller subspace of $R^n$ such
that the total squared error ($\epsilon^2$) is minimized:

$$\epsilon^2 = \min_{f} \sum_{k=1}^{K} (x_k - f(x_k))'(x_k - f(x_k))$$

Principal components states this minimum is achieved with an
orthogonal projection operator projecting into the subspace
spanned by the m eigenvectors corresponding to the m
largest eigenvalues of the second moment matrix. To reiterate,
the total squared error is given by the sum of the (n-m) smallest
eigenvalues.

Since  this  procedure  yields  the  minimum  error  in  terms 
of  RMS  criteria  our  investigation  of  orthogonal  operators 
for  bandwidth  compression  would  be  enhanced  if  a faster 
computational  process  approaching  the  minimum  error  could  be 
found.  In  fact,  this  is  possible,  and  discussed  in  the 
following  chapter  prior  to  discussing  our  results. 
SECTION II

DEVELOPMENT OF A FAST TWO-DIMENSIONAL
KARHUNEN-LOEVE TRANSFORM

1. INTRODUCTION

The purpose of transform coding is to store or
represent data in a reduced dimensional space and yet
preserve the data structure. When mean square error is the
optimality criterion, the principal components or
Karhunen-Loeve expansion is the best. However, the principal
components technique is generally not used in transform coding
image compression work because of its computational
complexity. In this document we describe how to implement the
principal components or Karhunen-Loeve transform (KLT) as
a fast transform for image data compression when the image
data satisfy mild stationarity and isotropy conditions.

Figure 2.1 illustrates the typical way transform
coding is done. The $N_r$ row and $N_c$ column image is
partitioned into subimages or windows each having $K_r$ rows and
$K_c$ columns. There are $(N_r/K_r)(N_c/K_c)$ such subimages.
Each subimage is transformed using a Hadamard, Fourier,
Slant, Discrete Cosine, or Discrete Linear Basis transform
(all of which have fast implementations) or by the principal
components or Karhunen-Loeve transform (which is slow). Those
transform domain components which have the highest energy or
variance are quantized and stored or transmitted while those
transform domain components which have lowest energy or
variance are not retained and effectively set to zero. Data
compression is achieved because the number of bits required
to encode the highest energy transform components is much
less than the number of bits required to encode the data in
its original spatial form.

In order to achieve the data compression by the
principal components technique, the gray levels in each of the
$(N_r/K_r)(N_c/K_c)$ subimages must be arranged as a vector and
the $(K_rK_c) \times (K_rK_c)$ auto-covariance matrix for a sample
fraction f of these vectors must be computed. This requires:

$$f \left(\frac{N_r}{K_r}\right) \left(\frac{N_c}{K_c}\right) (K_rK_c)^2 = f N_r N_c K_r K_c$$

operations. Next the eigenvectors and eigenvalues of the
auto-covariance matrix must be found. This requires on the
order of $(K_rK_c)^3$ operations. To take the dot product of each
subimage with $K_rK_c$ vectors to obtain the transformed image
requires $(N_r/K_r)(N_c/K_c)(K_rK_c)^2$ operations. The fast
transform technique described herein will only require:

$$\left(\frac{N_r}{K_r}\right) \left(\frac{N_c}{K_c}\right) (K_rK_c)(K_r+K_c)$$

operations. This represents a savings factor of $K_rK_c/(K_r+K_c)$.
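The operation counts above are easy to tabulate for a concrete case. The sketch below simply evaluates the stated formulas; the function name and the 256 x 256 image with 16 x 16 subimages are hypothetical choices for illustration, and the eigenvector computation is left out of the tally.

```python
def operation_counts(n_r, n_c, k_r, k_c, f=1.0):
    """Evaluate the operation-count formulas for an n_r x n_c image."""
    subimages = (n_r // k_r) * (n_c // k_c)
    size = k_r * k_c                   # length of each subimage vector
    return {
        "covariance": f * subimages * size ** 2,
        "direct_transform": subimages * size ** 2,
        "fast_transform": subimages * size * (k_r + k_c),
    }


counts = operation_counts(256, 256, 16, 16)
savings = counts["direct_transform"] / counts["fast_transform"]
print(counts["direct_transform"], counts["fast_transform"], savings)
```

For 16 x 16 subimages the savings factor is 256/32 = 8, in agreement with $K_rK_c/(K_r+K_c)$.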

To  begin  our  discussion  of  how  a fast  KLT  may  be  developed, 
we  must  first  discuss  the  general  form  of  the  auto-covariance 
and  the  kinds  of  matrices  which  may  be  diagonalized  by  fast 
transforms.  This  leads  us  to  composite  matrix  theory.  With 
this  background,  mild  stationarity  and  isotropic  assumptions 
will  lead  to  the  fast  implementation  by  yielding  a composite 
matrix  which  has  a direct  product  form. 


2. Forms of Matrices Which Fast Transforms Diagonalize

When dealing with the mean squared error criterion the
optimal transform technique is the principal components, or
by another name the Karhunen-Loeve transform (KLT). Assuming
the subimages have zero mean, the basis vectors of this
transformation are given by the eigenvectors of the
auto-covariance matrix for the set of subimages into which the
image is partitioned. Thus, in order to do a good job of data
compression we must select a fast transform technique whose
basis vectors are the eigenvectors of matrices similar to the
form of the auto-covariance matrix of the image.
To begin with, some examples (Figure 2.2) are presented
which are alike in the sense that the eigenvectors depend only
on the form of the matrix and not on any of the values the
elements of the matrix might take. Notice that in each of
these examples, the eigenvalue corresponding to an
eigenvector is easily obtained as the dot product of the
eigenvector with the first row of the matrix, e.g., for a
fourth order matrix with first row (a, b, c, d):

$$\begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1/3 & -1/3 & -1 \\ 1 & -1 & -1 & 1 \\ 1 & -3 & 3 & -1 \end{pmatrix} \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} = \begin{pmatrix} a + b + c + d \\ a + b/3 - c/3 - d \\ a - b - c + d \\ a - 3b + 3c - d \end{pmatrix}$$

This occurs because the first component of each eigenvector
is 1.


Figure 2.2. Examples of matrices in which eigenvectors
depend only on form of matrix.

From this small set of 2nd, 3rd, and 4th order matrix
eigenvector forms we can construct larger order
matrix-eigenvector forms in a way consistent with the fast transform
technique. Consider for example a composite matrix of the
form

$$A = \begin{pmatrix} H_1 & H_2 & H_3 \\ H_2 & H_1 - H_2 + H_3 & H_2 \\ H_3 & H_2 & H_1 \end{pmatrix}$$

where $H_1$, $H_2$, and $H_3$ all have the same eigenvectors (they
commute) and are of the form

$$\begin{pmatrix} a & b \\ b & a \end{pmatrix}$$

The eigenvectors of A are of the form x*y, the direct
product of $x = (x_1, x_2, x_3)'$, an eigenvector of a matrix of the form

$$\begin{pmatrix} a & b & c \\ b & a-b+c & b \\ c & b & a \end{pmatrix}$$

with y, an eigenvector of a matrix of the form

$$\begin{pmatrix} a & b \\ b & a \end{pmatrix}$$

See Afriat (1954) [1] and Williamson (1931) [2]. Why this
must be so is easily illustrated in our example. Consider,
with $H_i y = \eta_i y$,

$$A(x*y) = \begin{pmatrix} x_1H_1y + x_2H_2y + x_3H_3y \\ x_1H_2y + x_2(H_1 - H_2 + H_3)y + x_3H_2y \\ x_1H_3y + x_2H_2y + x_3H_1y \end{pmatrix} = \begin{pmatrix} x_1\eta_1y + x_2\eta_2y + x_3\eta_3y \\ x_1\eta_2y + x_2(\eta_1 - \eta_2 + \eta_3)y + x_3\eta_2y \\ x_1\eta_3y + x_2\eta_2y + x_3\eta_1y \end{pmatrix} = \begin{pmatrix} x_1\eta_1 + x_2\eta_2 + x_3\eta_3 \\ x_1\eta_2 + x_2(\eta_1 - \eta_2 + \eta_3) + x_3\eta_2 \\ x_1\eta_3 + x_2\eta_2 + x_3\eta_1 \end{pmatrix} * y$$

But since x is an eigenvector of any matrix of the form

$$\begin{pmatrix} a & b & c \\ b & a-b+c & b \\ c & b & a \end{pmatrix}$$

we have

$$\begin{pmatrix} \eta_1 & \eta_2 & \eta_3 \\ \eta_2 & \eta_1 - \eta_2 + \eta_3 & \eta_2 \\ \eta_3 & \eta_2 & \eta_1 \end{pmatrix} x = \lambda x$$

Hence, $A(x*y) = [Bx]*y = \lambda (x*y)$.

Thus, if y is an eigenvector of $H_1$, $H_2$, and $H_3$ so that
$H_1y = \eta_1y$, $H_2y = \eta_2y$, and $H_3y = \eta_3y$, and x is
an eigenvector of

$$B = \begin{pmatrix} \eta_1 & \eta_2 & \eta_3 \\ \eta_2 & \eta_1 - \eta_2 + \eta_3 & \eta_2 \\ \eta_3 & \eta_2 & \eta_1 \end{pmatrix}$$

the matrix B having the same form as A except that the
corresponding eigenvalues appear in place of the H matrices, and x
has corresponding eigenvalue $\lambda$, then x*y is an eigenvector of
A, with eigenvalue $\lambda$. In this example the eigenvectors of
A are

$$\begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \\ 1 \\ -1 \\ 1 \\ -1 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \\ -1 \\ -1 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \\ 0 \\ 0 \\ -1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \\ -2 \\ -2 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \\ -2 \\ 2 \\ 1 \\ -1 \end{pmatrix}$$

and they correspond to the fast transform implemented as the
direct product of the vectors in

$$\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}, \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix}$$

with those in

$$\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \end{pmatrix}$$
30 
The matrix A has the general form

$$A = \begin{pmatrix} a & b & c & d & e & f \\ b & a & d & c & f & e \\ c & d & (a-c+e) & (b-d+f) & c & d \\ d & c & (b-d+f) & (a-c+e) & d & c \\ e & f & c & d & a & b \\ f & e & d & c & b & a \end{pmatrix}$$

and has eigenvalues corresponding to the 6 listed eigenvectors
given respectively by

(a+b+c+d+e+f), (a-b+c-d+e-f), (a+b-e-f),
(a-b-e+f), (a+b-2c-2d+e+f), and (a-b-2c+2d+e-f).
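The direct-product eigenvector claim can be verified numerically for the 6 x 6 composite form. The sketch below is an illustration with hypothetical function names and arbitrarily chosen integer values for a through f. It builds A, forms each candidate eigenvector as the direct product of a three-component vector with a two-component vector, and confirms that A times the vector equals the eigenvalue times the vector, where the eigenvalue is read off as the dot product with the first row of A.

```python
def composite_matrix(a, b, c, d, e, f):
    """The 6 x 6 composite form built from the 2 x 2 blocks H1, H2, H3."""
    return [
        [a, b, c, d, e, f],
        [b, a, d, c, f, e],
        [c, d, a - c + e, b - d + f, c, d],
        [d, c, b - d + f, a - c + e, d, c],
        [e, f, c, d, a, b],
        [f, e, d, c, b, a],
    ]


def direct_product(x, y):
    """Direct product of two vectors: (x1*y, x2*y, x3*y) flattened."""
    return [xi * yj for xi in x for yj in y]


def matvec(m, v):
    return [sum(r * s for r, s in zip(row, v)) for row in m]


A = composite_matrix(7, 1, 2, 3, 4, 5)     # arbitrary illustrative values
xs = [(1, 1, 1), (1, 0, -1), (1, -2, 1)]
ys = [(1, 1), (1, -1)]

results = []
for x in xs:
    for y in ys:
        v = direct_product(x, y)
        eigenvalue = sum(r * s for r, s in zip(A[0], v))  # first row dot v
        results.append((eigenvalue, matvec(A, v) == [eigenvalue * s for s in v]))
print(results)
```

With a=7, b=1, c=2, d=3, e=4, f=5 the third eigenvector (1, 1, 0, 0, -1, -1) gives the eigenvalue a+b-e-f = -1.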


3. Stationarity and Isotropicity

In general, however, our auto-covariance matrix will not
have the previously discussed form without assumptions of
stationarity and isotropicity. [3] Suppose an image I had $N_r$
rows and $N_c$ columns. Partition this image into mutually
exclusive subimages each $K_r$ rows by $K_c$ columns. (We assume
$N_r$ is an integer multiple of $K_r$ and $N_c$ is an integer multiple
of $K_c$.) We say the image I is stationary (in the weak sense)
when two conditions are satisfied:

(1) The mean gray tone of all resolution cells situated
in subimage row column coordinates (i,j) is the same
constant $\mu$ which is independent of the relative
subimage coordinates (i,j).

(2) The gray tone covariance of all pairs of resolution
cells situated in subimage row column coordinates
$[(i,j), (i+n_1, j+n_2)]$ is a function $\sigma(n_1, n_2)$ only of
the row column translations $(n_1, n_2)$ and independent
of the relative subimage coordinates (i,j).
As illustrated in Figure 2.3, condition (1) implies, for
example, that the average gray tone of all resolution cells
occupying the upper left hand corner of the subimages equals
the average of all resolution cells occupying the upper right
hand corner of the subimages. In other words, fix a relative
coordinate of the subimage. Then the average gray tone taken
over all resolution cells having those relative coordinates in
the subimages is equal to the average taken over any other
relative coordinates.

As illustrated in Figure 2.4, condition (2) implies, for
example, that the average second moment gray tone taken between
resolution cells occupying the first and second columns of the
first row in each subimage must equal the average second moment
gray tone taken between resolution cells occupying the fifth
and sixth columns of the third row in each subimage. In other
words, fix some relative coordinates of the subimage. Then
choose a row column translation. Then the second moment gray
tone taken between all resolution cells situated in the specified
relative coordinates and in the specified relative coordinates
shifted by the row and column translation is independent of the
specified relative coordinates and only a function of the
translation.

An image I is isotropic if it is stationary and if the
covariance depends only on the spatial distance of the
translations. Thus, for example, the covariance for a shift of one
row down and two columns over is not necessarily equal to the
covariance for a shift of two rows down and one column over
for a stationary image, but since they both represent a
translation of distance $\sqrt{5}$ they would be equal for an isotropic
image.

Figure 2.4. Illustrates that in stationary images the
second moment statistics taken over all
resolution cells marked in image (a) equals
the second moment statistics taken over all
resolution cells marked in image (b).

In more mathematical terms, let

$Z_r = \{0, 1, \ldots, N_r - 1\}$ be the set of row indexes for digital
image I, and

$Z_c = \{0, 1, \ldots, N_c - 1\}$ be the set of column indexes for
digital image I.

Suppose $N_r$ is an integer multiple of $K_r$ and $N_c$ is an integer
multiple of $K_c$. Let

$$R(i,j) = \{ (a,b) \in Z_r \times Z_c \mid a \bmod K_r = i \text{ and } b \bmod K_c = j \}$$

$$S(i,j,n_1,n_2) = \{ [(a,b),(c,d)] \in (Z_r \times Z_c) \times (Z_r \times Z_c) \mid c = a + n_1,\ d = b + n_2 \text{ and } (a,b) \in R(i,j) \}$$

An image $I: Z_r \times Z_c \to G$ is stationary if and only if

$$(1)\quad \mu = \frac{1}{\#R(i,j)} \sum_{(a,b) \in R(i,j)} I(a,b) \quad \text{for each } (i,j)$$

$$(2)\quad \sigma(n_1,n_2) = \frac{1}{\#S(i,j,n_1,n_2)} \sum_{[(a,b),(c,d)] \in S(i,j,n_1,n_2)} (I(a,b) - \mu)(I(c,d) - \mu) \quad \text{for each } (i,j)$$

An image $I: Z_r \times Z_c \to G$ is isotropic if and only if I is
stationary and

$$\sigma(n_1, n_2) = \sigma(m_1, m_2) \quad \text{whenever} \quad n_1^2 + n_2^2 = m_1^2 + m_2^2$$

Consider now how we could compute the covariance matrix
for an image which satisfies or almost satisfies the
stationarity or isotropicity conditions. Shown in Figure 2.5 is a
16 x 16 image. We will partition it into 4x4 subimages
(Figure 2.6). Figure 2.7 shows the general form of the
auto-covariance matrix when no stationarity or isotropicity
assumptions are made. To determine the covariance matrix with
stationarity assumed, we generate a tableau which depicts all
pairs of resolution cells situated in the same translational
relationship [see Table 2.1]. From this table the stationary
autocovariance may be formed, Figure 2.8. If the tableau is
modified by the definition of isotropicity we can generate a
new table and subsequent covariance array, Table 2.2, and
Figure 2.9. The isotropic assumption basically establishes
equivalence classes of the lettered columns in Table 2.1 in
the following manner:

Figure 2.5. Illustrates a 16 x 16 image partitioned
into 4x4 subimages.

Figure 2.6. A $K_r \times K_c$ subimage of the original image.

Figure 2.7. Illustrates that the auto-covariance matrix for a
non-stationary image partitioned into 4 x 4
subimages has 136 distinct entries. Each distinct
entry is labeled with a two character label.


{a, g}  {x, r}  {s}  {u, i, p, j}  {v, d}
{f, n, b, h}  {w, k}  {2, u, c, o}  {m, i}
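The effect of the two assumptions on the number of distinct covariance entries can be counted directly. The sketch below is a counting illustration only, with a hypothetical function name. It assumes that under stationarity a translation and its negative share one covariance value (the covariance matrix is symmetric), and that under isotropy only the squared distance of the translation matters.

```python
from itertools import product


def distinct_entry_counts(k_r, k_c):
    """Count distinct covariance entries for k_r x k_c subimages."""
    cells = list(product(range(k_r), range(k_c)))
    general, stationary, isotropic = set(), set(), set()
    for p, q in product(cells, repeat=2):
        general.add(frozenset((p, q)))              # symmetric: unordered pairs
        dr, dc = q[0] - p[0], q[1] - p[1]
        stationary.add(max((dr, dc), (-dr, -dc)))   # shift and its negative coincide
        isotropic.add(dr * dr + dc * dc)            # only squared distance matters
    return len(general), len(stationary), len(isotropic)


print(distinct_entry_counts(4, 4))
```

For 4 x 4 subimages this gives 136 entries with no assumptions, in agreement with the Figure 2.7 caption, then 25 under stationarity and 10 under isotropy.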

4. Symmetry Properties of a Covariance Matrix of an Isotropic Image

A covariance matrix of an isotropic image has symmetry
properties which permit the rapid computation of its
eigenvectors. Such a covariance matrix can always be partitioned
into $K_r^2$ ($K_c \times K_c$) submatrices, each submatrix being of the
same Toeplitz form. Each such Toeplitz submatrix has only $K_c$
distinct entries as shown in Figure 2.10. The number of times
each distinct entry occurs is given by:

Entry name      Number of times entry occurs
$v_1$           $n_1 = K_c$
$v_2$           $n_2 = 2n_1 - 2$
$v_3$           $n_3 = n_2 - 2$
$v_4$           $n_4 = n_3 - 2$
...
$v_{K_c-1}$     $n_{K_c-1} = n_{K_c-2} - 2$
$v_{K_c}$       $n_{K_c} = n_{K_c-1} - 2 = 2$

The total number of distinct submatrices in the partition
of the covariance matrix is $K_r$. The submatrices themselves are
organized in a Toeplitz form. As shown in Figure 2.11 the
number of times each submatrix is repeated in the covariance
matrix is given by:
(1,2)-(1,3)  (4,1)-(4,2)  (1,2)-(1,2)  (3,2)-(3,2)
(3,4)-(4,1)  (3,1)-(3,4)  (2,1)-(2,3)  (1,3)-(1,4)
(4,2)-(4,3)  (1,3)-(1,3)  (3,3)-(3,3)  (4,1)-(4,4)
(2,2)-(2,4)  (2,1)-(2,2)  (4,3)-(4,4)  (1,4)-(1,4)
(3,4)-(3,4)  (3,1)-(3,3)  (2,2)-(2,3)  (2,1)-(2,1)
(4,1)-(4,1)  (3,2)-(3,4)  (2,3)-(2,4)  ...-(2,2)
(4,2)-(4,2)  (4,1)-(4,3)  (3,1)-(3,2)  (2,3)-(2,3)
(4,3)-(4,3)  (4,2)-(4,4)  (3,2)-(3,3)  (2,4)-(2,4)
(4,4)-(4,4)

Table 2.1. Lists in each column all pairs of resolution cells
situated in the same translation relationship. For
example, all the resolution cell pairs listed under
column c are related by three rows down and one column
across.

Figure 2.8. Form of auto-covariance with stationary
assumption.

Table 2.2. Lists in each column all pairs of resolution cells
situated in the same distance relationship. For
example, all the resolution cell pairs listed under
column c are related by distance = $\sqrt{10}$.
Number of times submatrix occurs
$m_1 = K_r$
$m_2 = 2m_1 - 2$
$m_3 = m_2 - 2$
...
$m_{K_r} = m_{K_r-1} - 2 = 2$
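The occurrence counts for a symmetric Toeplitz arrangement can be checked by brute force. The sketch below uses hypothetical function names; it counts how often each distinct entry (indexed by its distance from the main diagonal) appears in a K x K symmetric Toeplitz matrix, and compares the result with the recurrence that starts at K, drops to 2K - 2, and then decreases by 2 at each step. The same counts apply whether the items are scalar entries within a submatrix or submatrices within the covariance matrix.

```python
def toeplitz_entry_counts(k):
    """Count occurrences of each distinct entry in a k x k symmetric Toeplitz matrix."""
    counts = [0] * k
    for i in range(k):
        for j in range(k):
            counts[abs(i - j)] += 1    # the entry depends only on |i - j|
    return counts


def recurrence_counts(k):
    """Counts from the recurrence: first K, then 2K - 2, then minus 2 each step."""
    n = [k]
    if k > 1:
        n.append(2 * k - 2)
    while len(n) < k:
        n.append(n[-1] - 2)
    return n


print(toeplitz_entry_counts(5))
print(recurrence_counts(5))
```

For K = 5 both functions give 5, 8, 6, 4, 2, which sums to the 25 positions of the matrix.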


Figure 2.12 shows the form of a covariance matrix for an
isotropic image partitioned into 3 row by 4 column subimages.
With all this symmetry, we certainly should expect that the
calculation of the eigenvectors of this matrix involves less
work than we would need to do for a general covariance matrix.
Perhaps if we are lucky, the transformation defined by the
eigenvectors may even have a fast implementation.

5. Theory of Composite Matrices

The fast transform question is how close a composite matrix
we can find to the general form of an autocovariance
matrix. If we can find a composite matrix similar in form to
the autocovariance matrix, then we would expect that the easily
computed eigenvectors of the composite form would be similar
to the eigenvectors of the autocovariance matrix and that the
transformation of the image by these easily computed
eigenvectors (which have a direct product form) can be implemented
as a fast transform.

Our situation is a fortunate one because the theory of
composite matrices says that, if a matrix A can be partitioned
into submatrices which have the same set of eigenvectors, then
the eigenvectors of A can be formed as the direct product of
the eigenvectors of the submatrices with the eigenvectors of
the matrix of corresponding eigenvalues. Hence if:




Figure 2.11. Illustrates how the covariance matrix of an
isotropic image can be partitioned into submatrices each
of the toeplitz form shown in Figure 2.9.


Figure 2.10. Illustrates a 5x5 submatrix of the covariance
matrix of an isotropic image. The submatrix is toeplitz
and has only five distinct entries: v1, v2, v3, v4, v5.




Figure 2.12. Isotropic auto-covariance form when an
isotropic image is partitioned into 3x4 subimages
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A = 


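\begin{bmatrix} A_1 & A_2 & A_3 \\ A_4 & A_5 & A_6 \\ A_7 & A_8 & A_9 \end{bmatrix}

and v is an eigenvector of each submatrix $A_i$ with corresponding
eigenvalue $\lambda_i$, and u is an eigenvector of the matrix

\begin{bmatrix} \lambda_1 & \lambda_2 & \lambda_3 \\ \lambda_4 & \lambda_5 & \lambda_6 \\ \lambda_7 & \lambda_8 & \lambda_9 \end{bmatrix}

with corresponding eigenvalue $\eta$, then the direct product*
uv is an eigenvector of A with corresponding eigenvalue $\eta$. [4]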
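
The composite matrix property can be checked numerically on a small case. The sketch below is only an illustration under stated assumptions: the 2x2 blocks `A1` and `A2`, the helper names, and the chosen eigenvectors are hypothetical and do not come from the report. Both blocks share the eigenvector v = (1, 1), with eigenvalues 3 and 4, so the matrix of corresponding eigenvalues is [[3, 4], [4, 3]].

```python
def direct_product(u, v):
    """Direct product (uv)' = (u1*v1, ..., u1*vm, u2*v1, ..., un*vm)."""
    return [ui * vj for ui in u for vj in v]


def assemble(blocks):
    """Stack a grid of equally sized square blocks into one matrix."""
    rows = []
    for block_row in blocks:
        for k in range(len(block_row[0])):
            rows.append([x for blk in block_row for x in blk[k]])
    return rows


def matvec(M, x):
    """Matrix-vector product for lists of lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]


# Hypothetical blocks sharing the eigenvector v = (1, 1).
A1 = [[2, 1], [1, 2]]          # A1 v = 3 v
A2 = [[1, 3], [3, 1]]          # A2 v = 4 v
A = assemble([[A1, A2], [A2, A1]])

v = [1, 1]
lam = [[3, 4], [4, 3]]         # matrix of corresponding eigenvalues
u, eta = [1, -1], -1           # lam u = eta u

uv = direct_product(u, v)
print(matvec(A, uv), [eta * x for x in uv])
```

The two printed vectors agree, which is the statement of the theorem for this small case; it says nothing about how well a real covariance matrix satisfies the shared-eigenvector condition.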

The covariance matrix for the isotropic image almost has the
required property of the composite matrix. The covariance
matrix can be partitioned into submatrices all of the same
toeplitz form. However, this is not a guarantee that the
submatrices have the same eigenvectors.

Empirical work with the submatrices of the covariance
matrix of an isotropic image indicates that the submatrices
are almost multiples of one another and therefore have almost
the same eigenvector set. This justifies the following heuristic.
Generate a submatrix with the property that the squared norm
of the matrix of differences between the best multiple of
it and any given submatrix of the covariance matrix is the
smallest when averaged over all submatrices. Then replace
each submatrix with the best multiple of the generated
submatrix. This creates a covariance matrix having the required
composite structure. The eigenvectors can be determined


* If $u' = (u_1, \ldots, u_n)$ and $v' = (v_1, \ldots, v_m)$, then the direct
product

$(uv)' = (u_1 v_1, u_1 v_2, \ldots, u_1 v_m, u_2 v_1, \ldots, u_2 v_m, \ldots, u_n v_1, u_n v_2, \ldots, u_n v_m)$
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quickly  and  the  transformation  defined  by  the  eigenvectors 
has  a fast  implementation. 
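
The heuristic of replacing each submatrix by the best multiple of one generated submatrix can be sketched as follows. This is a minimal illustration, not the report's program: it assumes the second moment matrix is the matrix of inner products between submatrices treated as vectors, finds its principal eigenvector by power iteration, and uses the eigenvector entries as the multiples. The names `inner`, `principal_eigenvector` and `composite_fit` are hypothetical, and the power iteration assumes the all-ones starting vector is not orthogonal to the principal eigenvector.

```python
import math


def inner(P, Q):
    """Inner product of two equal-sized matrices treated as vectors."""
    return sum(p * q for rp, rq in zip(P, Q) for p, q in zip(rp, rq))


def principal_eigenvector(K, iters=200):
    """Unit-norm eigenvector of K with largest eigenvalue (power iteration)."""
    x = [1.0] * len(K)
    for _ in range(iters):
        y = [sum(k * xi for k, xi in zip(row, x)) for row in K]
        norm = math.sqrt(sum(t * t for t in y))
        x = [t / norm for t in y]
    return x


def composite_fit(subs):
    """Return multiples c and generated submatrix G with subs[i] ~ c[i] * G."""
    K = [[inner(P, Q) for Q in subs] for P in subs]   # second moment matrix
    c = principal_eigenvector(K)
    n_rows, n_cols = len(subs[0]), len(subs[0][0])
    G = [[sum(ci * S[r][s] for ci, S in zip(c, subs)) for s in range(n_cols)]
         for r in range(n_rows)]
    return c, G


# Hypothetical submatrices that are exact multiples of one toeplitz block.
B = [[4.0, 2.0], [2.0, 4.0]]
multiples = [1.0, 0.5, 0.25]
subs = [[[m * b for b in row] for row in B] for m in multiples]
c, G = composite_fit(subs)
approx = [[[ci * g for g in row] for row in G] for ci in c]
```

When the submatrices are exact multiples of one another, as in this toy case, `approx` reproduces them; for real covariance data the fit is only the least-squares approximation described in the text.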

To  demonstrate  this  approximation,  consider  the  Isotropic 
Auto-covariance  matrix  in  Figure  2.12.  If  we  generate  the 
second  moment  matrix  considering  each  submatrix  as  a vector 
the  following  form  emerges: 
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Figure  2.13.  Form  of  the  second  moment  matrix  derived  from 
the  isotropic  covariance  matrix  with  each  submatrix  as  a 
vector.  All  entries  labeled  q,  for  example,  are  the  sums  of 
products  of  entries  labeled  i and  g,  respectively,  of  the 
matrix  in  Figure  11. 


Notice that all the distinct variables are in the first
submatrix. It is, therefore, possible to find the eigenvector
having largest eigenvalue of this matrix without forming the
entire matrix. All that is necessary is to store the first
submatrix of the second moment matrix.

Let $v' = (v_1, \ldots, v_9)$ be the eigenvector of $\Sigma_{11}$ having
largest eigenvalue $\lambda$. Then $\Sigma_{11} v = \lambda v$. The first row of
the matrix equation is:

$p v_1 + q v_2 + r v_3 + q v_4 + p v_5 + q v_6 + r v_7 + q v_8 + p v_9 = \lambda v_1$

and the fifth row is

$p v_1 + q v_2 + r v_3 + q v_4 + p v_5 + q v_6 + r v_7 + q v_8 + p v_9 = \lambda v_5$
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Obviously,  the  first  row  and  the  fifth  row  are  equal  so 
$v_1 = v_5$. In like manner, the other rows may be compared,
resulting in:


$3p v_1 + 4q v_2 + 2r v_3 = \lambda v_1$
$3q v_1 + 4t v_2 + 2s v_3 = \lambda v_2$
$3r v_1 + 4s v_2 + 2u v_3 = \lambda v_3$


Separating  these  simultaneous  equations  we  have 
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This  always  happens  because  in  the  general  case  we  have  only 
Kc independent equations, all of which make up the first
submatrix of the second moment matrix derived from an isotropic
covariance array in this manner. We need not worry about
which elements are equal as long as our index follows the
toeplitz form. By using the resulting eigenvector and
symmetry, the appropriate multiples for each submatrix of the
auto-covariance matrix can be determined. The coefficients
found by this procedure may be illustrated by:
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can  then  be  immediately  formed  as: 


The  composite  matrix 
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The  composite  matrix  resulting  from  this  operation  has  several 
properties which should be recognized. First, the submatrices
of the composite matrix all have the same eigenvectors (they
commute). Second, each submatrix is of the same toeplitz
form. Third, corresponding to the shared eigenvectors $V_i$
is the lambda matrix of corresponding eigenvalues,


[Lambda matrix with entries $\lambda_1, \ldots, \lambda_{16}$; layout not
recoverable], $\lambda_j$ being the eigenvalue of submatrix j
under the eigenvector $V_i$.

Obviously, for each $V_i$ the lambda
matrices differ only by a multiplicative constant and so also
share the same eigenvectors. The direct product of these two
sets of eigenvectors gives the eigenvectors of the composite
matrix. Since the eigenvectors of the composite matrix can be
represented by the direct product of two sets of vectors, the
eigenvector transformation has a fast implementation. (See
Figures 2.14 and 2.15.)

Once  the  fast  implementation  has  been  defined  by  the 
direct  product  relation  of  the  shared  eigenvectors  of  the 
submatrices  and  the  shared  eigenvectors  of  the  lambda  matrices 
we  have  created  a set  of  basis  vectors  which  may  be  used  to 
define  an  orthogonal  transformation  on  an  image.  (See  Figures 
2.16  and  2.17). 
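
The saving from the direct product form can be sketched on one 3 row by 4 column subimage. The code below is an illustration under stated assumptions: the row basis `R`, the column basis `C` and the subimage `X` are hypothetical integer matrices chosen for easy checking (they are orthogonal only up to row scaling and are not the bases derived in this report). It compares the row-then-column operation with multiplication by the full 12 x 12 direct product matrix.

```python
def matmul(P, Q):
    """Plain matrix product for lists of lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def transpose(M):
    return [list(col) for col in zip(*M)]


def kron(C, R):
    """Direct (Kronecker) product of matrices C and R."""
    return [[C[i][j] * R[k][l]
             for j in range(len(C[0])) for l in range(len(R[0]))]
            for i in range(len(C)) for k in range(len(R))]


def separable(X, C, R):
    """Transform each row with R, then each resulting column with C."""
    rows_done = matmul(X, transpose(R))
    return matmul(C, rows_done)


def direct(X, C, R):
    """Same transform applied as one large matrix on the flattened subimage."""
    flat = [x for row in X for x in row]
    y = [sum(t * f for t, f in zip(row, flat)) for row in kron(C, R)]
    width = len(X[0])
    return [y[i * width:(i + 1) * width] for i in range(len(X))]


# Hypothetical bases: rows of R act on length-4 rows, rows of C on length-3 columns.
R = [[1, 1, 1, 1], [1, 1, -1, -1], [1, -1, -1, 1], [1, -1, 1, -1]]
C = [[1, 1, 1], [1, 0, -1], [1, -2, 1]]
X = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]

print(separable(X, C, R) == direct(X, C, R))
```

For this subimage size the direct form needs 12 x 12 = 144 multiplications, while the row-column form needs 3 x 16 + 4 x 9 = 84, and the gap widens as the subimage grows.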


6. Computational Results

In terms relative to the transform coding of the image,
what we have essentially done is to see under what conditions
the Karhunen-Loeve transform is separable so that each
subimage can be transformed by first operating on its rows and
then by columns. We determined that in order for this to
happen the image had to be isotropic and we had to approximate
the submatrices of the isotropic covariance matrix with
submatrices having the same eigenvectors. In order to determine
if the image isotropicity assumption and stationarity
approximations were good ones for image data, we ran some
experiments comparing the fast approximate K-L transform with
the standard K-L transform. We found, as described in the
following discussion, that the fast approximate K-L transform
gives results better than the Hadamard, Fourier, and
Slant. A 512 by 512, 6 bit digital image was compressed at
ratios of 2 bits/pel, 1 bit/pel, and .5 bits/pel. A comparison
was made between the Standard K-L transform, the Fast
K-L transform, the Discrete Linear Basis, the Slant transform
and the Discrete Cosine transform. The error criteria


Figure 2.14. Each subimage has three rows by four columns.

The fast implementation transforms each of the
four rows of the subimage first and then
transforms each of the three resulting columns.
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Figure  2.16.  Row-column  operation  with  basis  vectors. 
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orthogonal  raw  trans 
formation  basis 
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orthogonal  column 
transformation  basis 
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figure  2.17.  Basis  vector  sets  derived  from  composite 
matrix  for  direct  product  implementation. 
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* 
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chosen  was  the  RMS  and  RMS  correlated  measures  (Haralick  and 
Shanmugam,  1974).  [5] 

The first error investigated was that of the stationarity
and isotropic assumptions. The matrix norms of the
differences between the various combinations are:

Auto-Covariance  - Stationary  Covariance  = 5.7628 

Stationary  Covariance  - Isotropic  Covariance  = 5.5923 

Auto-Covariance  - Isotropic  Covariance  = 4.4510 

The  RMS  errors  should  be  compared  to  401.765,  the  average 
value  of  the  diagonal  elements  of  the  covariance  matrices. 

If we let A, S, L represent the auto-covariance of the
original image, the stationary covariance, and the isotropic
covariance, respectively, then:

$(A-L) = (A-S) - (L-S)$

$\|A-L\| \le \|A-S\| + \|L-S\|$

and our results satisfy the triangle inequality. In terms
of distance measure, Figure 2.18 depicts the differences of
these approximations. The percentage difference, from the
average variance of the auto-covariance, is 1.434% for the
stationary assumption and 1.130% for the isotropic
assumption. Examples of the actual auto-covariance matrix and
auto-covariance matrices under the stationarity and isotropicity
assumption are included in Figures 2.19, 2.20 and 2.21,
respectively.
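
The distance comparison can be illustrated with a short sketch. The report does not state which matrix norm was used, so the Frobenius norm here is an assumption, and the 2x2 matrices are hypothetical stand-ins for the actual covariance matrices. The last lines repeat the check on the distances quoted above and express the stationary distance as a percentage of the average variance, 401.765.

```python
import math


def frobenius(P, Q):
    """Frobenius norm of the difference P - Q of two equal-sized matrices."""
    return math.sqrt(sum((p - q) ** 2
                         for rp, rq in zip(P, Q) for p, q in zip(rp, rq)))


# Hypothetical stand-ins for the auto-, stationary and isotropic covariances.
A = [[4.0, 1.0], [1.0, 3.0]]
S = [[3.5, 1.2], [1.2, 3.5]]
L = [[3.5, 1.0], [1.0, 3.5]]
assert frobenius(A, L) <= frobenius(A, S) + frobenius(L, S)

# Distances quoted in the text.
d_AS, d_SL, d_AL = 5.7628, 5.5923, 4.4510
assert d_AL <= d_AS + d_SL

mean_variance = 401.765
pct_stationary = 100.0 * d_AS / mean_variance
print(round(pct_stationary, 3))
```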

The  eigenvectors  of  the  standard  auto-covariance  matrix 
do  not,  in  general,  have  the  discrete  sequency  property  which 
other  fast  transforms  have.  The  eigenvectors  resulting  from 
the  fast  K-L  do  have  the  sequency  property  as  the  other  fast 
transforms  have.  The  eigenvectors  for  both  the  standard  K-L 
and  Fast  K-L  are  compared  in  Figures  2.22  and  2.23.  The 
subimage  size  used  was  4x4  only  because  of  minicomputer 



Figure  2.18.  Illustrates  the  distance  between  the  actual 
auto-covariance  matrix  derived  from  original 
image  (upper  triangle),  the  covariance  matrix 
assuming  image  stationarity  and  the  covariance 
matrix  assuming  image  isotropicity . 


Figure  2.20.  Auto-covariance  matrix  derived  using 
stationary  assumption. 


Standard K-L


Fast K-L


Figure  2.22.  Comparison  of  eigenvectors  for  Standard  K-L 
and  Fast  K-L  in  order  of  sequency. 
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figure  2.23.  Comparison  of  eigenvectors  for  Standard  K-L 
and  Fast  K-L  in  order  of  sequency. 


memory  and  computational  restrictions  on  determining  the 
standard  K-L.  The  fast  version  can  use  much  larger  subimages 
because  of  the  symmetry  and  storage  savings. 

Results of the compression at 2 bits/pel, 1 bit/pel, and
.5 bits/pel are plotted for comparison in Figure 2.24. It
is apparent that the Fast K-L out-performs the other
transforms and is closest to the optimum, differing from the
optimum K-L only by approximately 1%. Table 2.3 gives a
comparison of these errors.

7. Conclusions

In terms relative to the transform coding of an image we
have presented the conditions under which the Karhunen-Loeve
transform is separable and developed an approximate fast K-L
transform. We have discussed the approximations of
stationarity and isotropicity in terms of the L2 distance norms.

Our results indicate that the fast K-L transform is comparable
to other discrete transforms both in terms of sequency and
compression performance. Experimental data indicate that
the fast K-L transform is closest to the optimum, differing
from the optimum K-L by approximately 1%.
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Figure  2.24a.  Comparison  of  RMS  error  as  a function  of 
compression ratio for K-L, Fast K-L,
Slant and DLB.
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Figure  2.24b.  Comparison  of  percentage  error  as  a function 
of compression ratio for Fast K-L, Slant
and DLB.


TABLE 2.3

TABLE OF RMS ERROR

TRANSFORM         COMPRESSION RATIO
                  2.0 bits/pel   1.0 bits/pel   .5 bits/pel
Standard K-L      2.9071         4.1855         4.9461
Fast K-L          2.9362         4.2098         5.0094
DLB               2.9492         4.2819         5.1993
DCT               3.0454         4.3288         5.0418
Slant             2.9492         4.2819         5.1993

% ERROR COMPARED TO STANDARD K-L

TRANSFORM         COMPRESSION RATIO
                  2.0 bits/pel   1.0 bits/pel   .5 bits/pel
Fast K-L          1.0000         .580           1.279
DLB               1.448          2.303          5.119
Slant             1.448          2.303          5.119
DCT               4.754          3.423          1.934
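
The percentage figures in Table 2.3 appear to be the relative increase in RMS error over the Standard K-L at the same rate. The sketch below recomputes them from the RMS values in the table under that assumption; the dictionary layout and the function name `percent_over_standard` are illustrative only.

```python
# RMS errors from Table 2.3 at 2.0, 1.0 and .5 bits/pel.
rms = {
    "Standard K-L": (2.9071, 4.1855, 4.9461),
    "Fast K-L": (2.9362, 4.2098, 5.0094),
    "DLB": (2.9492, 4.2819, 5.1993),
    "DCT": (3.0454, 4.3288, 5.0418),
    "Slant": (2.9492, 4.2819, 5.1993),
}


def percent_over_standard(name):
    """Percent increase in RMS error relative to the Standard K-L."""
    base = rms["Standard K-L"]
    return tuple(100.0 * (e - b) / b for e, b in zip(rms[name], base))


for name in ("Fast K-L", "DLB", "Slant", "DCT"):
    print(name, [round(p, 3) for p in percent_over_standard(name)])
```

Under this reading the Fast K-L stays within roughly 1.3% of the Standard K-L at all three rates, which is consistent with the statement that it differs from the optimum by approximately 1%.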
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SECTION  III 

ENCODING  FOR  COMPRESSION 
1. INTRODUCTION

In Chapter 1, the concept of the fast transforms was
discussed and the criteria for compression of an image were
given. Chapter 2 discussed under what conditions a fast
optimum transform existed. In this chapter the quantization
and coding of the transformed coefficients is viewed from
three categories. These categories are compression by
sampling, compression by energy thresholding, and an optimum
bit encoding scheme. This material is amplified by the
suggestion of preprocessing. If information about the
statistics of the image, or class of images, is known the
image quality may be improved in terms of the visual criteria.
Sampling techniques may be divided into two categories: those
which are based on the unique structure of energy in the
transform plane and those which are related to the spatial
distribution of frequency or sequency in the transform domain.
Those techniques based on energy are guaranteed to give best
results in terms of RMS error and they appear to have given the
best results visually as well.

In this program the compression requirements are more
severe than what has been reported in the literature. The
compression in the transform domain must approach .6 and
.7 bits/pel (10:1 component compression). Of the methods
reported in the literature the following are most
predominant: [1]

a.  Checkerboard  sampling 

b.  Random  sampling 

c.  Zonal  sampling 

d.  Threshold  sampling 

To  satisfy  a given  error  criterion,  the  "best"  compression 
maximizes  the  number  of  components  set  to  zero  while 
minimizing  the  error. 
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2 . Statistical  Measures 

This section presents a brief discussion of several
basic statistical measures derived from natural aerial
imagery.

Probability  Density  Functions 

In order to develop the marginal and joint probability
density functions used, consider the following:

Let $Z_x = \{1, \ldots, N_x\}$ be the set of row indexes for the
digital image $I$;

$Z_y = \{1, \ldots, N_y\}$ be the set of column indexes for the
digital image $I$.

$I : Z_x \times Z_y \to S$

where $S$ is the set of digitized grey levels.

Then the marginal probability is given by

$P(s_i) = \dfrac{\#\{(n,m) \in Z_x \times Z_y \mid I(n,m) = s_i\}}{\#(Z_x \times Z_y)}$

Let $R$ be a binary relation defining the specified
spatial relationship of any two resolution cells,

$R \subseteq (Z_x \times Z_y) \times (Z_x \times Z_y)$ defined by

$R = \{((a,b),(c,d)) \mid \rho((a,b),(c,d)) = r\}$

where $\rho$ is a metric on $Z_x \times Z_y$.

The joint probability density function may then be defined by:

$P(s_i, s_j) = \dfrac{\#\{((a,b),(c,d)) \in R \mid I(a,b) = s_i,\ I(c,d) = s_j\}}{\#R}$


Scene  probability  density  functions  are  useful  in  selecting 
quantizer  assignments  in  approaches  which  use  additional 
quantizing  assignments,  linear  or  non-linear,  prior  to  coding. 
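
The two definitions above can be written out directly as counting procedures. The sketch below is a minimal illustration: the image is a hypothetical 2 x 2 array, and the city-block distance is used as the metric only as an assumption, since the report does not say which metric was chosen.

```python
from collections import Counter
from itertools import product


def marginal(image):
    """P(s): fraction of resolution cells having grey level s."""
    counts = Counter(g for row in image for g in row)
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}


def city_block(p, q):
    """Assumed metric on cell coordinates (not specified in the report)."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def joint(image, r, metric=city_block):
    """P(s_i, s_j): fraction of ordered cell pairs at distance r with those levels."""
    cells = list(product(range(len(image)), range(len(image[0]))))
    counts = Counter()
    for p, q in product(cells, cells):
        if metric(p, q) == r:
            counts[(image[p[0]][p[1]], image[q[0]][q[1]])] += 1
    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items()} if total else {}


# Hypothetical two-level image.
image = [[0, 0], [1, 1]]
print(marginal(image))
print(joint(image, 1))
```

The pair loop is quadratic in the number of cells, so it is only suitable for small examples; a practical program would visit each cell once and look only at the offsets that satisfy the distance relation.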
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These  histograms,  of  a two-dimensional  image,  indicate  how 
much of the dynamic range of gray levels is used and give
an estimate of the distribution. Ten images were received
from the Air Force and had been digitized to 6 bits (64 gray
levels). A few of these images were selected at random and
joint histograms made. This class of images was all military
vehicles in natural settings with partial sky in the image.
These estimates of the distributions are illustrated in
Figures 3.1, 3.2, 3.3. In general, the relative joint
distribution is a function of the distance between the grey
tones. The high correlation in the images tested here shows
that a range of 2 and 5 with a circular pattern does not
change the joint density distribution significantly (See
Figures 3.4, 3.5). These statistics suggest that perhaps a
better cosmetic quality could be achieved by requantizing
the image prior to transforming and compression. Indeed,
this was verified, as will be seen in Section 3.2, by the use
of equal probability quantizing.

Equal probability quantization is a procedure by which
the average information content of the quantized distribution
is maximum. If we define $q_1, q_2, \ldots, q_J$ as the quantized levels
and $p_1, p_2, \ldots, p_J$ as the probabilities associated with these
quantization levels, then the average information is defined
to be:

$H = -\sum_{i=1}^{J} p_i \log p_i$

H will be maximum when the quantization levels are
selected with equal probabilities. In order to determine
these quantization levels the relative frequency function
is determined for the gray levels of the pixels in the image.
The cumulative frequency distribution is then formed from
the relative frequency function and the probability range
is divided into equal increments as a function of the gray
levels.
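
The procedure just described can be sketched in a few lines. This is an illustration under stated assumptions rather than the program used in the study: each gray level is assigned to the output level given by the cumulative count of pixels below it, and the pixel list is a small hypothetical sample. The names `equal_probability_map` and `entropy` are not from the report.

```python
import math
from collections import Counter


def entropy(values):
    """Average information H = -sum p_i log2 p_i of a list of levels, in bits."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def equal_probability_map(pixels, n_levels):
    """Map each gray level to an output level using the cumulative histogram."""
    counts = Counter(pixels)
    total = len(pixels)
    mapping, below = {}, 0
    for g in sorted(counts):
        # 'below' is the cumulative count of pixels darker than g.
        mapping[g] = (below * n_levels) // total
        below += counts[g]
    return mapping


# Hypothetical sample concentrated in the lower gray levels.
pixels = [0] * 4 + [1] * 2 + [2] + [3]
mapping = equal_probability_map(pixels, 2)
equal = [mapping[g] for g in pixels]
linear = [g // 2 for g in pixels]      # uniform requantization for comparison
print(mapping, entropy(equal), entropy(linear))
```

On this sample the equal probability assignment splits the pixels evenly between the two output levels and so reaches the maximum of 1 bit, while the uniform assignment gives a lower average information.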
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Figure  3.1.  Joint  Probability  Distributions 
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Figure 


Probability  Distribut 


Joint  Probability  Distribution-spa 


3. Compression by Sampling

Sampling  techniques  may  be  divided  into  those  which 
depend  on  the  structure  of  energy  and  those  depending  on 
frequency  of  the  component  position.  In  some  cases  the  two 
may  be  closely  related  but  in  general  we  cannot  depend  on 
this.  However,  if  we  are  willing  to  make  this  assumption 
a great  advantage  is  achieved  by  the  ease  of  implementation. 
A notch  filter  (sampling)  approach  was,  therefore,  used  to 
investigate  the  quality  of  the  reconstructed  image  after 
compression. 

Consider the transformed frequency space resulting from
one of the transforms mentioned previously. The low
frequencies of the transform domain contain the largest
percent of energy. If an original image f(x,y) has magnitude
ranging in unit steps from 0 to A then the maximum magnitude
of a transform sample will be AN and the minimum non-zero
magnitude 1/N^2, where N is the number of samples. [1] Since
the dynamic range is larger in the frequency domain only a
few points can take on large values. This is because by
Parseval's relation the energy in the spatial and transform
domains is equal. A "notch" filter may then be constructed
by transforming into the frequency domain and deleting
frequencies in a specified range. The transmitted samples are
the set of all remaining non-deleted frequencies. The
frequencies to keep can be approximated by analysis of the
smallest dimension which must be retained in the image. If
this yields a recognizable target then frequencies can be
eliminated and/or high frequencies emphasized. Figure 3.6
shows typical patterns used in the transform compression by
"notch" filtering.

A set of three images was processed using this "notch
filter" or sampling approach. The three originals had been
digitized linearly by the Avionics Laboratory to 64
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Figure  3.6.  Notch  Filtering  Patterns 


discrete  levels.  (See  Figure  3.7)  Figure  3.8  illustrates 
the  sequence  of  operations. 

The fast Fourier transform was then utilized in compression.
The first stage of this process was to investigate which
components would be deleted and what effect this deletion would
have on the image. This was done without regard to blocking
and without the log of brightness. The first notch filter
used was that labeled as inner, outer range of (4, 9), with the
components within the range of 4 and 9 being set to zero on
each 16 by 16 block. This was done also for ranges (3, 9),
(4, 12), and (3, 12). The corresponding compression ratios
are given below:

Notch Filter Range     Component Compression

4, 9                   4:1
3, 9                   5:1
4, 12                  7:1
3, 12                  12:1

Figure  3.9  shows  the  corresponding  notch  patterns. 
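
A notch pattern of this kind can be sketched as a mask over one 16 by 16 block. The report does not state how the range is measured, so the code below makes an explicit assumption: distance is the Euclidean distance from the d.c. term with wrap-around, and a component is deleted when that distance lies between the inner and outer range inclusive. Because of that assumption the ratios it prints need not match the tabulated values. The names `notch_mask` and `component_compression` are hypothetical.

```python
import math


def notch_mask(n, inner, outer):
    """True where a component is kept; deleted when inner <= d <= outer."""
    mask = []
    for u in range(n):
        du = min(u, n - u)              # wrap-around distance along rows
        row = []
        for v in range(n):
            dv = min(v, n - v)          # wrap-around distance along columns
            d = math.hypot(du, dv)
            row.append(not (inner <= d <= outer))
        mask.append(row)
    return mask


def component_compression(mask):
    """Ratio of total components to retained components."""
    kept = sum(sum(row) for row in mask)
    return len(mask) * len(mask[0]) / kept


for inner, outer in [(4, 9), (3, 9), (4, 12), (3, 12)]:
    ratio = component_compression(notch_mask(16, inner, outer))
    print((inner, outer), round(ratio, 2))
```

With this convention an outer range of 12 exceeds the largest distance in a 16 by 16 block, so the (4, 12) mask keeps only the low frequency components inside range 4, in line with the remark that the (4, 12) notch deletes all frequencies beyond inner range 4.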

The resulting images indicate that for higher component
compression, the low frequency terms up to range 4 must be
maintained. With a range of three, significant loss in
resolution was noticed. The image comparison for these ranges
is shown in Figure 3.10. The corresponding RMS error was
also higher for the low inner distance range of 3. This is
because of a large change in energy. It was, therefore,
decided to use notch filters with inner/outer range of (4, 9)
and (4, 12). This produces better resolution. The (4, 12)
notch deletes all frequencies beyond inner range 4. This
did not degrade resolution when compared to (4, 9). The
images were then compressed also using the log of the
brightness and, as expected and previously reported in the
literature, the images at both 4:1 and 7:1 did appear to be
better with respect to visual quality.
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Original  Images  from  Wright  Patterson  Air 
Force  Base  Digital  Tape 



Figure 3.9a. Notch Patterns for 3,9 and


AD-A039  761 


20F  3 

^8)39761 


KANSAS  UNIV/CENTER  F0«  RESEARCH  INC  LAWRENCE  F/0  1 

VIDEO  BANDWIDTH  COMPRESSION* (U) 

FEB  77  N C ORISWOLD.  R M HARALICK  F33615-7A-C-1093 

*87-9  AFAL-TR-76-102  NL 



Figure  3.9b.  Notch  Patterns  for  4,9  and 


Figure  3.10.  Comparison  of  Component  Compression  for 
Notch  Filter 
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The  blocking  was  then  considered.  The  blocking  occurs 
due to edge to edge brightness differences within the
subimages. This may be reduced by convolving a 2 x 2 window
over the total image or by averaging the first and last
row and column of the adjacent subimage. A comparison of the
truck image with log of brightness and log of brightness
with a 2 x 2 convolved window is shown in Figure 3.11.

After studying the images, the question of improving
the quality of the notch filter approach was examined. The
(4, 9) notch (4:1 compression) filter was reprocessed but
this time a high frequency weighting, linearly proportional to
the magnitude of the d.c. term, was used. See Figure 3.12.
This indicated that the image was, in fact, much improved
in terms of the visual perception or cosmetic qualities.

The RMS error was increased. This points out that weighting
the high frequencies can raise the RMS error, because it does
not make the image closer to the original in the mean square
sense, while still giving better cosmetic qualities.
Therefore, decisions based on mean square error alone cannot
be a complete description of how good the image is.

The RMS error of the bridge scene was calculated for the
pre-emphasis case to be 6.53379. If the RMS error due solely
to the blocking is evaluated it is found to be 4.72548.

This error is a measure of the differences just on the
borders of our 16 x 16 blocks and is approximately 12%
of the total error. This indicates that with blocking removed
the error would be 1.808 (6.53379 - 4.72548).

Further improvement in the notch filter approach was believed possible if the statistics of the original image are considered. From the relative frequency plots of Figures 3.2 and 3.3, it is apparent that most of the picture elements fall in the lower gray level range. Equal probability quantizing was, therefore, utilized to reduce the data from 6 bits to


Figure 3.11. Comparison of log of brightness and log of brightness with convolution (panels: Log with Convolution; Log; Fourier Transform)

Figure 3.12. High Frequency Emphasis (Fourier Transform)
5 bits, yielding a further compression of 1.2. The resulting image was processed through the (4, 9) "notch" filter. The compression of the (4, 9) notch is 4:1; the additional 1.2 from equal probability quantization yields a 4.8:1 compression. The results of this process are shown in Figure 3.13. This indicates that all images, if used with equal probability quantizing, can be increased in compression without noticeable degradation. This results in a compression of 8-10, depending on the "notch" which is used.
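
The equal probability quantizing step can be illustrated with a minimal sketch. The report does not list its quantizer program, so the function names, the rule for closing a bin, and the toy 8-level histogram below are assumptions made for illustration only. The idea shown is that output bins are chosen from the cumulative histogram so that each bin holds roughly the same number of picture elements, which gives the crowded low gray levels finer steps.

```python
from bisect import bisect_left

def equal_probability_thresholds(histogram, n_out):
    """Return the n_out - 1 input levels that end the first n_out - 1 output bins.

    Hypothetical helper: a bin is closed whenever the running count reaches
    the next k/n_out share of all picture elements.
    """
    total = sum(histogram)
    thresholds = []
    cumulative = 0
    for level, count in enumerate(histogram):
        cumulative += count
        while (len(thresholds) < n_out - 1
               and cumulative * n_out >= (len(thresholds) + 1) * total):
            thresholds.append(level)
    return thresholds

def quantize(level, thresholds):
    """Output code = number of bin-ending levels strictly below the input level."""
    return bisect_left(thresholds, level)

# Toy histogram (assumed data) skewed toward the low gray levels.
histogram = [40, 20, 10, 10, 5, 5, 5, 5]
thresholds = equal_probability_thresholds(histogram, 4)
codes = [quantize(level, thresholds) for level in range(len(histogram))]

# Compression bookkeeping from the text: 6 bits -> 5 bits on top of a 4:1 notch.
extra = 6 / 5
total_compression = 4 * extra
```

In this toy case the four output bins hold 40, 20, 20 and 20 of the 100 elements; an exactly equal split is not possible when a single input level already holds more than one share.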

The Hadamard and Slant transforms were investigated next. The results of these transforms compressed by an L-notch filter are shown in Figures 3.14 and 3.15.

An example of the L-notch pattern is illustrated in Figure 3.16. The truck scene was selected as the image with the most activity, and the Discrete Linear Basis (DLB) was also used in the component comparison. Table 3.1 and Figure 3.17 give the RMS comparisons. Since the DLB was the best in terms of RMS error, it was compared to the Karhunen-Loeve transform, the optimum under this criterion, for the three scenes.

Table 3.2 compares the component compression and error, while the pictorial results are shown in Figures 3.18 and 3.19.

4. Compression by Energy Thresholding

By investigating the energy associated with each frequency transform component, and sorting these frequency components according to that value, the importance of each component in the reconstruction of the image may be deduced. The components with the highest energy values are more important than those with smaller values, and this provides the means of compression. Almost all the energy will be contained in a few low frequency terms. Energy component compression is defined by the following process:

The energy of the projections corresponding to each basis vector used in a particular fast transform is computed.


Figure 3.13. 32 Level - Equal Probability Quantizing

Figure 3.14. Slant Transform (panels: 2:1, 4:1, 8:1)
TABLE 3.1

COMPRESSION - RMS ERROR, SCENE 002 TRUCK ON BRIDGE

KL                      FOURIER            DLB                SLANT              HADAMARD
COMP         RMS ERROR  COMP   RMS ERROR   COMP   RMS ERROR   COMP   RMS ERROR   COMP   RMS ERROR
[illegible]  3.2556     3.87   5.712       2.13   4.008       2:1    4.381       2:1    5.66
19           4.696      5.12   6.649       5.33   5.374       4.26   4.862       4.26   6.88
8:1          6.129      7.1    5.88        8.0    5.953       8.26   6.358       8.26   6.90
16:1         7.176      12.11  —

(The last row also carries the entry 6.8; its column is not recoverable.)


Figure 3.17. RMS Errors, Truck on Bridge
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Figure  3.18.  Compression  for  Three  Scenes  Karhunen- 


Those projections having less than a specified energy are modified by setting them to zero. The ratio of the total number of basis vectors or projections to the number not set to zero is the component compression achieved. Reconstruction is achieved by taking the inverse transform of the modified projections.
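
The thresholding procedure just described can be sketched as follows. This is an illustration under stated assumptions, not the report's program: the helper names are hypothetical, the energy of a basis vector is taken here as the mean squared projection over all subimage blocks, the toy coefficients are invented, and the threshold is assumed to keep at least one component. The forward and inverse transforms themselves are outside the sketch.

```python
def component_energies(blocks):
    """Mean squared projection onto each basis vector, taken over all blocks."""
    n_blocks = len(blocks)
    n_components = len(blocks[0])
    return [sum(b[k] ** 2 for b in blocks) / n_blocks for k in range(n_components)]

def energy_threshold(blocks, min_energy):
    """Zero every component whose energy is below min_energy.

    Returns the modified projections, the keep mask, the component
    compression (total components / components kept) and the fraction
    of total energy retained.
    """
    energies = component_energies(blocks)
    keep = [e >= min_energy for e in energies]
    modified = [[c if k else 0.0 for c, k in zip(b, keep)] for b in blocks]
    ratio = len(keep) / sum(keep)
    retained = sum(e for e, k in zip(energies, keep) if k) / sum(energies)
    return modified, keep, ratio, retained

# Two toy blocks of four transform coefficients each (assumed data).
blocks = [[10.0, 3.0, 1.0, 0.5],
          [8.0, -4.0, 0.5, -0.5]]
modified, keep, ratio, retained = energy_threshold(blocks, min_energy=1.0)
```

Here two of four components survive, so the component compression is 2:1 while about 99% of the energy is retained; the reconstructed image would be the inverse transform of `modified`.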

It is, therefore, possible to improve on the earlier "notch" patterns used in component compression by examining how the positions of the frequency components used are located according to energy importance.

The energy retained in the sampling approach was approximately 95-96%. In this energy thresholding technique the retained energy has been increased to a range of 99.3% to 99.94% of the total energy available. The difference is in the components of the frequency domain selected. These projections (components) were ordered according to energy values calculated in the frequency domain. The first N components out of M possible were selected to yield the appropriate component compression. The energy in the remaining M-N components accounts for the error in the reconstructed image. The patterns of the L-notch type filter used previously can then be compared to the components selected by energy. Figures 3.20, 3.21 and 3.22 compare these patterns with increasing energy for the Slant transform. It becomes obvious that the notch patterns previously used indiscriminately eliminate high energy components, thereby contributing to a higher error.

From another view, the unsorted energy (variance) and sorted variance may be compared. Figure 3.23 illustrates a few components in the low and mid-range which are unsorted. Figures 3.24-3.26 demonstrate how this is changed for the DLB, SLANT, and Hadamard. Energy compression was then implemented using the DLB, SLANT, and Hadamard.

The corresponding RMS (root mean square) and RMS correlated errors are tabulated for the three transforms.

Figure 3.20. Two-dimensional Frequency Pattern, Slant Transform, Component Compression 4:1.
Note: Each Block Indicates a Frequency Position. Each Number Indicates Energy Order. Blank Position Indicates Component Not Used. (Axis labels: F(U,1); Sub-image Frequency Position F(1,V).)

Figure 3.21. Two-dimensional Frequency Pattern, Slant Transform, Component Compression 8:1.
Note: Each Block Indicates a Frequency Position. Each Number Indicates Energy Order. Blank Position Indicates Component Not Used.

Figure 3.22. Two-dimensional Frequency Pattern, Slant Transform, Component Compression 10:1.
Note: Each Block Indicates a Frequency Position. Each Number Indicates Energy Order. Blank Position Indicates Component Not Used.

Figure 3.23. Variance as a function of component number for low and midrange components of the DLB (axis: Component Number)
Figure 3.24. Sorted variance as a function of component number for DLB.

Figure 3.25. Sorted variance as a function of component number for Slant.

Figure 3.26. Sorted variance as a function of component number for Hadamard.
See Table 3.3. There is no question that the Hadamard is the poorest transform based on this criterion, followed by the SLANT, with the DLB being the best of the three. The correlation values of the error image for the SLANT and DLB are less than for the Hadamard. A low correlation means less dependence between neighboring cells in the error image, and therefore less redundant information left uncapitalized upon. Stated simply, the SLANT and DLB energy compression is more efficient in reducing redundancy. Typical error pictures for Scenes 1 and 2 (tank and truck on bridge) are shown in Figures 3.27 and 3.28. These pictures indicate that the errors are predominant around the edges of the target. The high spatial correlation is reflected in higher values of correlated RMS error, showing the usefulness of the correlated RMS error as a measure of the magnitude and spatial distribution of errors.

As stated under theoretical background, the correlation measure is dependent upon the distance used in comparing the probability of occurrence of any two gray levels. A few graphs of the correlation measure as a function of distance are given in Figures 3.29, 3.30 and 3.31. The reconstructed images are shown in Figures 3.32 and 3.33.

5. Optimum Bit Encoding

The bit encoding procedure must determine the optimum number of projections to use and the bits per projection. Therefore, given that a finite number of bits is to be distributed among a maximum number of projections (NOPT), the optimum number of projections will be determined as well as the bits per projection. The optimizing criterion is the overall mean squared error. The number of bits necessary for each projection is determined by the distribution and variance of that particular component taken over the total image. The actual value of the projection within this distribution is quantized by minimum variance quantization. [2]
TABLE 3.3

SCENE 1    8:1 ENERGY COMPRESSION

TRANSFORM  % ENERGY  RMS ERROR    LAG CORRELATION  DIST  CORRELATED RMS ERROR
HAD        99.84     .3264489E01  .95214E-01       1     .31082E00
                                  .93699E-01       2     .30588E00
                                  .91331E-01       5     .29814E00
                                  .88209E-01       10    .28796E00
SLANT      99.85     .31572E01    .92728E-01       1     .29276E00
                                  .92187E-01       2     .29105E00
                                  .89653E-01       5     .28305E00
DLB        99.01     .3112585E01  0.8236E-01       1     .25916E00
                                  0.8405E-0        2     .26163E00
                                  0.8253E-01       5     .25688E00
                                  0.7974E-01       10    .2482E00

SCENE 2    8:1 ENERGY COMPRESSION

TRANSFORM  % ENERGY  RMS ERROR    LAG CORRELATION  DIST  CORRELATED RMS ERROR
HAD        99.84     .46239E01    .116777E00       1     .53997E00
                                  .118379E00       2     .54738E00
                                  .116557E00       5     .53895E00
SLANT      99.85     .44744E01    .112432E00       1     .50306E00
                                  .115880E00       2     .51849E00
                                  .116129E00       5     .51961E00
                                  .114492E00       10    .51205E00
DLB        98.95     .44234E01    .9715E-01        1     .4297E00
                                  .1014E00         2     .4485E00
                                  .10469E00        5     .46308E00


Figure 3.29. Correlation Function vs. Distance.

Figure 3.30. Correlation Function vs. Distance (Scene 2; axis: Correlation).
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Figure  3.32.  Energy  Compression  - Scene 


a. Minimum Variance Quantization

Since a digital transmission system has a finite transmission rate, a quantizer must sort the input signal into a finite number of ranges, N. The input signal is in general a floating point value representing a transformation of a data vector in the frequency or transformed domain. For a given N the system is described by specifying the end point, x_k, of the k-th input range and y_k, an output level corresponding to each input range. Of course, the input frequency components correspond to the number of projections retained in the transformation for transmission. Each component will have a distribution which indicates the variation of values that the component may take on. The optimizing criterion is the minimum (overall) mean square error. This criterion may be utilized by defining the distortion of the quantization process as some statistic of the quantization error. The distortion, D, may be defined as the expected value of f(e), where e is the quantization error and p(x) is the density function of the input amplitudes. The function f(e) is assumed to be a differentiable function.

Then

$$D = E\{f(s_{in} - s_{out})\} = \sum_{k=1}^{N} \int_{y_{k-1/2}}^{y_{k+1/2}} (x - y_k)^2 \, p(x)\, dx$$

where $x_{i+1} - x_i = y_{k+1/2} - y_{k-1/2} = \Delta y_k$, an input range.

The distortion in terms of each step is, therefore:

$$\overline{(x - y_k)^2} = \int_{y_{k-1/2}}^{y_{k+1/2}} (x - y_k)^2 \, p(x)\, dx = \sigma_k^2$$

where p(x) is considered a constant over the range of integration and equal to $p(y_{avg})$, with

$$y_{avg} = \frac{y_{k+1/2} + y_{k-1/2}}{2}$$

for intervals spaced sufficiently close. Then:

$$\sigma_k^2 = \frac{p(y_{avg})}{3}\left[(y_{k+1/2} - y_k)^3 + (y_k - y_{k-1/2})^3\right]$$

$$\frac{d\sigma_k^2}{dy_k} = p(y_{avg})\left[-(y_{k+1/2} - y_k)^2 + (y_k - y_{k-1/2})^2\right] = 0$$

or

$$y_k = \frac{y_{k+1/2} + y_{k-1/2}}{2}$$

Thus the condition for minimizing $\sigma_k^2$ is to locate the output value $y_k$ halfway between $y_{k+1/2}$ and $y_{k-1/2}$.

Therefore:

$$y_{k+1/2} = y_k + \frac{\Delta y_k}{2}, \qquad y_{k-1/2} = y_k - \frac{\Delta y_k}{2}$$

Substituting into the above:

$$\sigma_k^2 = \frac{(\Delta y_k)^3}{12}\, p(y_k)$$

The total mean square error voltage is found by summing over all levels (total distortion):

$$D = \frac{1}{12} \sum_{k} p(y_k)\,(\Delta y_k)^3$$

As given by the paper of Joel Max [2], the distortion for the P-th component, taken from the tangent at a point, is given empirically as:

$$D = \sigma_p^2\, N^{-q}$$

where
N is the number of levels,
$\sigma_p^2$ is the variance of the P-th component,
D is the total distortion for N levels,
q is a constant to be determined.
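
As a worked check of the total distortion formula, consider a special case that is not taken from the report: an input uniformly distributed on [0, 1] and divided into N equal steps. This case is an assumption chosen only because it can be evaluated by hand. Then $p(y_k) = 1$ and $\Delta y_k = 1/N$ for every step, and the variance of the input is $\sigma^2 = 1/12$, so

$$D = \frac{1}{12}\sum_{k=1}^{N} 1 \cdot \left(\frac{1}{N}\right)^{3} = \frac{1}{12\,N^{2}} = \sigma^{2} N^{-2}$$

For this uniform case the distortion has exactly the empirical form $\sigma^2 N^{-q}$ with q = 2. For other densities q must be found from the slope of the distortion curve, as discussed in the text.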


The distortion or total mean squared error of any frequency component can be measured as a function of the number of bits allocated or number of levels. The range of the component which is divided into these levels is determined by the variance of the frequency component. A plot of distortion as a function of these levels for several frequency components is shown in Figure 3.34. From these plots it is apparent that large variance (high energy) components must be divided into a higher number of levels to minimize the distortion.

In our program this quantization error is given by:

$$Q(I) = \sigma^2(I)\; C^{-2\,\mathrm{Nbits}(I)}$$

where I denotes the I-th projection,
C is 1.78 for a normal distribution, and
Nbits(I) is the number of bits assigned to the I-th component [2].

The truncation error due to using p projections out of a total number of NPR projections is given by:

$$T(p) = \sum_{I=1}^{NPR} VAR(I) - \sum_{I=1}^{p} VAR(I) = L - \sum_{I=1}^{p} VAR(I)$$

where VAR(I) is the variance of the I-th component. The total error is given by [2]:

$$\text{Total Error} = L - \sum_{I=1}^{p} VAR(I) + \sum_{I=1}^{p} VAR(I)\; C^{-2\,\mathrm{Nbits}(I)}$$

The first two terms are the truncation error and the last term is the minimum variance quantization error.

An approximate solution to the problem posed above can be obtained by treating the discrete variables in the above as continuous variables. [3] The value of NOPT is given by:

$$NOPT = \min_{1 \le p \le NPR} \left\{ VAR(p) - CON^{1/p}\,[D(p)] \right\}$$

where:

$$CON = \prod_{I=1}^{p} VAR(I) \Big/ C^{\,2 \cdot MBITS}$$

$$D(p) = 1.0 + \log_e VAR(p) + \frac{2 \cdot A \cdot MBITS}{p} - \frac{1}{p}\sum_{I=1}^{p} \log_e VAR(I)$$

$$A = \log_e C$$

The  bit  assignments  are  given  by: 


NBITS(J)  = 0.5*Loge(VAR(J)  ) + 

, NOPT 

2xNOPT  ^ Loge  (VAR (I)  ) 

Since NBITS(I) was treated as a continuous variable above,

$$\sum_{J=1}^{NOPT} NBITS(J) = MBITS$$

will not be true after we round off each NBITS(I) to an integer. To make this equality hold, we need to remove or add 1 bit at a time to the array NBITS.

See Appendix II.
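
The optimization can be sketched in a few lines. This is not the program of Appendix II; it is a minimal illustration that assumes only the total error model stated above (truncation error plus VAR(I)·C^(-2·Nbits(I)) summed over the retained projections), strictly positive variances, and a hypothetical function name. For each candidate p it uses the continuous solution in which every retained projection carries the same quantization error, picks the p with the least total error, rounds, and then adds or removes one bit at a time. The rule for choosing which projection gains or loses a bit (largest or smallest current quantization error) is an assumption of this sketch.

```python
import math

def allocate_bits(variances, mbits, c=1.78):
    """Return (nopt, nbits) for variances sorted into decreasing order."""
    var = sorted(variances, reverse=True)
    a = math.log(c)
    total_var = sum(var)
    best = None
    for p in range(1, len(var) + 1):
        logs = [math.log(v) for v in var[:p]]
        mean_log = sum(logs) / p
        # Continuous allocation: equal quantization error on every retained projection.
        bits = [mbits / p + (lv - mean_log) / (2 * a) for lv in logs]
        if min(bits) < 0:
            continue  # this p would need a negative bit count; not usable
        common_error = math.exp(mean_log - 2 * a * mbits / p)
        error = total_var - sum(var[:p]) + p * common_error
        if best is None or error < best[0]:
            best = (error, p, bits)
    _, nopt, continuous_bits = best
    nbits = [round(b) for b in continuous_bits]

    def quantization_error(i):
        return var[i] * c ** (-2 * nbits[i])

    # Restore sum(nbits) == mbits one bit at a time (assumed greedy rule).
    while sum(nbits) < mbits:
        nbits[max(range(nopt), key=quantization_error)] += 1
    while sum(nbits) > mbits:
        candidates = [i for i in range(nopt) if nbits[i] > 0]
        nbits[min(candidates, key=quantization_error)] -= 1
    return nopt, nbits

# Toy variances (assumed data) and a budget of 8 bits.
nopt, nbits = allocate_bits([100.0, 50.0, 10.0, 1.0, 0.1], mbits=8)
```

With these toy numbers the two smallest variances receive no bits at all, which is the behaviour the text describes: low energy projections are truncated and the budget is spent on the high energy ones.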
The value of C = 1.78 used for a normal distribution must be verified for this process. If the distortion as a function of number of levels is plotted on a log-log graph, then the tangent at a point will be the slope of the curve at that point. This is the value of q in the equation:

$$D = \sigma_p^2\, N^{-q}$$

Then by taking the log of both sides we have:

$$\log D = \log \sigma_p^2 - q \log N$$

or

$$-q = \frac{\log D}{\log N} - \frac{\log \sigma_p^2}{\log N}$$

The quantization error given for our program is

$$Q(I) = \sigma^2(I)\; C^{-2\,\mathrm{Nbits}}$$

therefore $N^{-q} = C^{-2\,\mathrm{Nbits}}$ if the variances of these expressions are equal. Then:

$$C = \text{Anti-log}\left(\frac{q \log N}{2\,\mathrm{Nbits}}\right)$$

In our optimization process, however, the total distortion is based not only on the quantization error but also on the truncation error. This allows the number of bits (Nbits) to change for any component variance. Therefore, C must be related to how close the distribution is to a normal distribution for a given variance.

A further test to determine how close the frequency distribution of a component is to a normal distribution can be made with the Mann-Whitney-Wilcoxon test. [3]

Let X and Y be stochastically independent random variables of the continuous type.

Let F(x) and G(y) denote the distribution functions of X and Y.

Let $X_1, X_2, \ldots, X_m$ and $Y_1, Y_2, \ldots, Y_n$ denote independent samples from these distributions. Define:

$$Z_{ij} = \begin{cases} 1, & X_i < Y_j \\ 0, & X_i > Y_j \end{cases}$$

and consider the statistic

$$U = \sum_{j=1}^{n} \sum_{i=1}^{m} Z_{ij}$$

and note that $\sum_{i=1}^{m} Z_{ij}$ counts the number of values of X that are less than $Y_j$, $j = 1, 2, \ldots, n$; thus U is the sum of these n counts.

The smallest value that U can take on is 0 and the largest value that U can take on is mn. Thus the space of U is {u: u = 0, 1, 2, ..., mn}. If U is large, the values of Y tend to be larger than the values of X. Thus if we test the hypothesis $H_0: F(z) = G(z)$ against

$$H_1: F(z) > G(z)$$

the critical region is U > C.

To determine the size of the critical region we need the distribution of U when $H_0$ is true.

It has been proven that [4]:

$$\frac{U - \dfrac{mn}{2}}{\sqrt{\dfrac{mn(m+n+1)}{12}}} \quad \text{is } N(0, 1)$$

therefore, this hypothesis can be tested for various significance levels. U may be determined by:

$$U = T - \frac{n(n+1)}{2}$$

where T is the sum of the ranks of $Y_1, Y_2, \ldots, Y_n$ among the m+n items $X_1, \ldots, X_m, Y_1, \ldots, Y_n$ once this combined sample is ordered.

Therefore, at the $\alpha$ significance level we have:

$$\Pr\left\{ \frac{U - \dfrac{mn}{2}}{\sqrt{\dfrac{mn(m+n+1)}{12}}} > N(\alpha) \right\} = \alpha$$

for $N(\alpha)$ the table value $(1 - \alpha)$ of the normal distribution.
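
The two ways of obtaining U given above, the direct double sum over $Z_{ij}$ and the rank-sum shortcut, can be compared with a small sketch. The function name and the sample values are illustrative assumptions, and the samples are assumed to contain no ties, in keeping with the continuous-type assumption of the text.

```python
import math

def mann_whitney_u(xs, ys):
    """Return (U from the double sum, U from ranks, standardized statistic)."""
    m, n = len(xs), len(ys)
    # Direct definition: count the pairs with X_i < Y_j.
    u_direct = sum(1 for y in ys for x in xs if x < y)
    # Rank-sum form: T is the sum of the ranks of the Y values in the ordered pool.
    pooled = sorted([(v, False) for v in xs] + [(v, True) for v in ys])
    t = sum(rank for rank, (_, is_y) in enumerate(pooled, start=1) if is_y)
    u_ranks = t - n * (n + 1) // 2
    # Standardized value, approximately N(0, 1) under the null hypothesis.
    z = (u_direct - m * n / 2) / math.sqrt(m * n * (m + n + 1) / 12)
    return u_direct, u_ranks, z

# Toy samples (assumed data): the Y values tend to be larger than the X values.
xs = [1.1, 2.3, 3.7]
ys = [2.0, 4.1, 5.2, 3.9]
u_direct, u_ranks, z = mann_whitney_u(xs, ys)
```

The standardized value `z` would then be compared with the normal table value for the chosen significance level.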


The histograms of these components are shown in Figures 3.35, 3.36 and 3.37. The question which must be answered is: are the frequency components sufficiently close to a normal distribution to allow the value of C = 1.78 in the optimum bit allocation program to be used? The results of this test of hypothesis indicate that all a.c. frequency components are sufficiently close to normal to use the value C = 1.78 in the quantization process. The d.c. component is neither normal nor uniform, although it tends to resemble a uniform density function. For this reason a value of C = 1.78 was used in the minimization process.

The values C = 5.61 and C = 1.78 were compared using the bit allocation program to show the effect of changing this value from the normal assumption. A Hadamard transform with compression to 1 bit/pel was used. See Figure 3.38.

The fact that the image with a C value of 1.78 is much better indicates agreement with the hypothesis test.

Reconstructed  images  of  the  Hadamard,  SLANT,  and  DLB 
using  this  optimum  bit  allocation  for  Scene  1 (tank  in 
ditch)  and  Scene  2 (truck  on  bridge)  are  shown  in  Figures 
3.39,  3.40  and  3.41.  Visually  these  reconstructed  images 
can  be  compared  to  component  compression  ratios.  The  1 
bit/pel  is  equivalent  to  6:1  component  compression,  while 
the  .75  bits/pel  and  .5  bits/pel  are  equivalent  to  8:1  and 
12:1  component  compression  ratios,  respectively.  Visual 
comparison  of  the  reconstructed  image  shows  that  .5  bits/pel 
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Figure  3.36.  Distribution  of  Components 
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Lant  Optimum  Bit  Allocation 


is  feasible  with  either  the  SLANT  or  DLB  for  a 512  x 512 
image.  The  Hadamard  has  an  increase  in  blocking  at  .5 
bits/pel. 

The variable word length coding scheme is more efficient because it uses fewer bits for components with a low energy contribution and more bits for components with a high energy contribution. The quality of the image (cosmetic effects) is better than for the equivalent component compression. Stated another way, for the same RMS error an additional compression of 1.25-1.5 can be gained by optimum bit allocation.

In terms of the error criteria, the RMS and RMS correlated errors are less with the variable length code, as may be seen in Table 3.4 and Figure 3.42 for the same component compression. It was, therefore, determined that the bit encoding process would be used for evaluating the "best" transform and the interframe process.
SCENE 2

VISUAL RMS ERROR CORRELATION CORRELATED

TRANSFORM COMMENT COMPRESSION RATIO AT .5 bit/pel. LAG DIST RMS ERROR

1.0 bit/pel. .75 bit/pel. .5 bit/pel.
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TABLE 


SECTION  IV 


A CRITICAL  COMPARISON  OF  FAST  TRANSFORMS 
1.  INTRODUCTION 

In the previous chapters, we have examined the error criteria and the optimum transforms and, utilizing a few of the leading transforms, seen results of fixed and adaptive compression schemes. In this chapter we must examine the performance of each transform in the light of our chosen error criteria and bit allocation algorithm. The choice of the "best" transform may then be utilized in the interframe process.

In the search for the "best" fast transform, other researchers have compared their most popular transform with a few of the other fast transforms. However, in the rapid growth of dimensionality reduction as applied to images, a comparison of all fast transforms has not been made at this time. The comparisons which are reported in the literature must also be viewed critically because each researcher is experimenting with different size images digitized to 10, 8, and 6 bits, utilizing different compression schemes. The error criterion has also been a variable in the past. Some researchers use only visual criteria; others use RMS, but the scaling of this error criterion has made it difficult to compare the performance reported by different researchers. It is, therefore, the intent of this investigation to utilize the identical coding algorithm, error criteria, and representative image to reduce the variability to a minimum, and to report the performance of each fast transform available for image compression.

2. Translation by the Mean

Prior to examining the results of all transforms it is desirable to modify the image input data to insure that the minimum mean square error is achieved for each transform process. This is accomplished by translating the vectors representing a subimage by the mean of all vectors in that image. Returning to the principal components we can see why this is so.

Define P such that $P : R^N \to R^N$ and is a real N x N matrix. Then P is an orthogonal projection operator if and only if:

$$P^2 = P$$
$$P = P'$$

Then by the following theorem, the translation by the mean, in conjunction with principal components, minimizes the error (see Haralick [1]).

Theorem 2: Let $x_1, x_2, \ldots, x_K$ be given vectors in $R^N$. Let $P : R^N \to R^N$ be an orthogonal projection operator onto the subspace spanned by the M eigenvectors of

$$\sum_{k=1}^{K} x_k x_k'$$

with largest eigenvalues.

Let

$$\epsilon^2 = \sum_{k=1}^{K} (x_k - P x_k)'\,(x_k - P x_k)$$

Let the vector Z be the arithmetic mean of $x_1, \ldots, x_K$:

$$Z = \frac{1}{K} \sum_{k=1}^{K} x_k$$

Also let $P^* : R^N \to R^N$ be an orthogonal projection operator onto the subspace $V^*_M$ spanned by the M eigenvectors of

$$\sum_{k=1}^{K} (x_k - Z)(x_k - Z)'$$

with largest eigenvalues, and then let

$$\epsilon_*^2 = \sum_{k=1}^{K} \big((x_k - Z) - P^*(x_k - Z)\big)'\,\big((x_k - Z) - P^*(x_k - Z)\big)$$

then $\epsilon^2 \ge \epsilon_*^2$.
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This  theorem  is  proved  in  Appendix  III.  We  are  now  ready  to 
compare  the  results  of  all  transforms  in  terms  of  the  visual, 
RMS,  and  correlated  error  criteria. 
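
The content of Theorem 2 can be checked numerically in the smallest possible case. The sketch below is an illustration only, with assumed data and hypothetical helper names: it uses vectors in the plane (N = 2) and a single retained eigenvector (M = 1), so that the principal eigenvector of the 2 x 2 scatter matrix can be written in closed form through its orientation angle, and compares the residual error with and without translation by the mean.

```python
import math

def principal_axis(vectors):
    """Unit eigenvector with the largest eigenvalue of sum(v v') for 2-D vectors."""
    a = sum(x * x for x, _ in vectors)
    b = sum(x * y for x, y in vectors)
    d = sum(y * y for _, y in vectors)
    # Orientation of the major axis of the symmetric matrix [[a, b], [b, d]].
    theta = 0.5 * math.atan2(2 * b, a - d)
    return math.cos(theta), math.sin(theta)

def residual(vectors):
    """Sum of squared distances from each vector to its projection on the axis."""
    ex, ey = principal_axis(vectors)
    return sum(x * x + y * y - (x * ex + y * ey) ** 2 for x, y in vectors)

# Toy vectors (assumed data) lying well away from the origin.
data = [(10.0, 1.0), (11.0, 3.0), (12.0, 2.0), (13.0, 4.0)]
mean_x = sum(x for x, _ in data) / len(data)
mean_y = sum(y for _, y in data) / len(data)
centered = [(x - mean_x, y - mean_y) for x, y in data]

eps2 = residual(data)           # error without translation by the mean
eps2_star = residual(centered)  # error after translation by the mean
```

For these vectors the error after translation is 1.0, against roughly 3.3 without it, in agreement with the inequality of the theorem.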

3. Comparison of Transforms

The truck scene was selected as representative of a typical Remotely Piloted Vehicle image. The man-made structure as well as the background requires resolution performance to reproduce the gray level changes and edges. Other scenes available for this examination were considered spatially too flat. This scene is a 512 x 512, 6 bit digital image. The image was segmented into 1024 blocks of 16 x 16, or one-dimensional vectors of length 256. The following transform processes were compared:

1. Hadamard
2. Fast Fourier
3. Slant
4. DLB (16 basis set)
5. DLB (8·2·8·2) basis set
6. DCT
7. Fast K-L

a. Visual Criteria

Figures 4.1-4.9 depict the reconstructed images and their corresponding error images for compression ratios of 6:1, 8:1, and 12:1. In terms of average bits per picture element these are 1 bit/pel, .75 bit/pel and .5 bits/pel. There is, of course, a general degradation from 1 bit/pel to .5 bits/pel. This is most noticeable on the fender of the truck and by the loss of the exhaust pipe near the cab. The visual errors are more easily discerned for any one compression by the error images. An increase in detail of the bank and in contrast on the railroad tracks and bridge in the error image is indicative of poorer performance. It appears, therefore, that all images are acceptable at the 1 bit/pel compression but only the DLB (8·2·8·2), Discrete Cosine, and Fast K-L Transforms are still performing adequately at the required .5 bits/pel. Table 4.1 classifies
TABLE 4.1

Tabulated Errors vs. Compression Ratio: 1.0 bit/pel

Transform      Visual Comment     RMS     Lag Dist.  Correlation  Correlated Error
Fourier        Poor               4.0078  1          .10468879    .4195717326
                                  4.0078  2          .10909377    .4372260114
                                  4.0078  5          .10897077    .436733052
                                  4.0078  10         .10781116    .432085567
Hadamard       Fair               3.8528  1          .10758324    .4144967071
                                  3.8528  2          .11060267    .426129967
                                  3.8528  5          .11497447    .442973638
                                  3.8528  10         .1078733     .4156142502
Slant          Acceptable         3.7892  1          .10662527    .4040244731
                                  3.7892  2          .11079641    .4198290
                                  3.7892  5          .11075597    .4196765215
                                  3.7892  10         .10843463    .4108805
DLB (16x16)    Acceptable         3.9547  1          .1082334     .4280306
                                  3.9547  2          .1111686     .4396384
                                  3.9547  5          .11195713    .4427567
                                  3.9547  10         .1093633     .432499
DLB (8·2·8·2)  Good               3.7667  1          .10636617    .4006491
                                  3.7667  2          .10964318    .4129926
                                  3.7667  5          .10953416    .4125820
                                  3.7667  10         .1069852     .4029811
DCT            Good               3.6503  1          .10554689    .3852774
                                  3.6503  2          .11048985    .4033209
                                  3.6503  5          .10939932    .3993402
                                  3.6503  10         .10767923    .3930613
Fast K-L       Excellent to Good  3.6278  1          .10535227    .38219696
                                  3.6278  2          .10979803    .39832529
                                  3.6278  5          .10927618    .39643212
                                  3.6278  10         .10743651    .38975817
TABLE 4.1 (Continued)

Tabulated Errors vs. Compression Ratio: .75 bit/pel

Transform      Visual Comment      RMS     Lag Dist.  Correlation  Correlated Error
Fourier        Poor                4.3781  1          .10641426    .4658922717
                                   4.3781  2          .11174843    .4892458014
                                   4.3781  5          .11301188    .4947773118
                                   4.3781  10         .11126707    .4871383592
Hadamard       Poor                4.2293  1          .10843291    .4585953063
                                   4.2293  2          .11205909    .4739315093
                                   4.2293  5          .11222098    .4746161907
                                   4.2293  10         .10958335    .4634608622
Slant          Acceptable to Fair  4.1327  1          .10767135    .4449733881
                                   4.1327  2          .11225209    .4639042123
                                   4.1327  5          .11230641    .4641287006
                                   4.1327  10         .11046972    .4565382118
DLB (16x16)    Acceptable to Fair  4.2816  1          .10952806    .468955
                                   4.2816  2          .1135467     .4861615
                                   4.2816  5          .1141603     .4887887
                                   4.2816  10         .11178365    .4786126
DLB (8·2·8·2)  Good                4.0683  1          .10698853    .4352613
                                   4.0683  2          .11135341    .453019
                                   4.0683  5          .1119608     .4554901
                                   4.0683  10         .10978346    .4466318
DCT            Good                3.9727  1          .10626637    .4221641
                                   3.9727  2          .11134476    .442339
                                   3.9727  5          .11103255    .4410988
                                   3.9727  10         .10885344    .4324419
Fast K-L       Good                3.9567  1          .10573179    .4183489
                                   3.9567  2          .11113668    .43973450
                                   3.9567  5          .11053920    .43737045
                                   3.9567  10         .1089543     .4310994788
TABLE 4.1 (Continued)

Tabulated Errors vs. Compression Ratio: .5 bit/pel

Transform      Visual Comment      RMS      Lag Dist.  Correlation  Correlated Error
Fourier        Poor                4.9412   1          .11130523    .5499814025
                                   4.9412   2          .11556605    .5710349663
                                   4.9412   5          .1179737     .5829316464
                                   4.9412   10         .11637595    .5750368441
Hadamard       Poor                4.7767   1          .11049775    .5278146024
                                   4.7767   2          .11396599    .5443813444
                                   4.7767   5          .11497440    .5491982165
                                   4.7767   10         .11288615    .5392232727
Slant          Acceptable to Poor  4.5942   1          .10924148    .5018772074
                                   4.5942   2          .11414770    .5244173633
                                   4.5942   5          .1156673     .5313987097
                                   4.5942   10         .11365033    .5221323461
DLB (16x16)    Acceptable to Poor  4.7318   1          .11010933    .5210151
                                   4.7318   2          .1144489     .4515493
                                   4.7318   5          .11584558    .5481577
                                   4.7318   10         .11358686    .53747
DLB (8·2·8·2)  Acceptable to Fair  4.5241   1          .10965364    .4960838
                                   4.5241   2          .1143836     .5174828
                                   4.5241   5          .1154832     .5224575
                                   4.5241   10         .1131099     .5117204
DCT            Good                4.43664  1          .10691045    .4743229
                                   4.43664  2          .11386058    .505158
                                   4.43664  5          .11446434    .5078368
                                   4.43664  10         .11266079    .4998349
Fast K-L       Good                4.4292   1          .10692673    .4735998
                                   4.4292   2          .11347968    .5026241987
                                   4.4292   5          .11412505    .5054826
                                   4.4292   10         .11246462    .498128294
the performance based on the compression ratio and quality.
The subjective judgment can be aided by enlargement of these
images. The .5 bit/pel images were blown up to show the
effect of blocking in the image (Figures 4.10-4.16). These
enlargements indicate that the DLB (8*2*8*2), Discrete Cosine,
or Fast K-L is adequate for the higher compression ratios.

b. RMS Error Criteria

The second criterion we wish to investigate is the RMS
error. Figure 4.17 illustrates the error as a function of
compression. The Fast Karhunen Loeve gives the smallest RMS
error of the transforms tested. This performance is followed
closely by the Discrete Cosine and the Discrete Linear Basis
with the 8*2*8*2 basis set. Although the Fast K-L transform
more closely approaches the Karhunen Loeve, the basis set is
highly dependent on the image and is rather lengthy to compute
for the number of images to be used in the interframe process.
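
For reference, the RMS error criterion can be written as a short routine. This is a minimal sketch, assuming the error is the root of the mean squared gray-level difference between the original and the reconstructed image; the function name `rms_error` and the toy 4x4 image are illustrative and are not taken from the report's programs.

```python
import math

def rms_error(original, reconstructed):
    """Root-mean-square gray-level difference between two equal-size images."""
    diffs = [
        (a - b) ** 2
        for row_a, row_b in zip(original, reconstructed)
        for a, b in zip(row_a, row_b)
    ]
    return math.sqrt(sum(diffs) / len(diffs))

# Toy 4x4 image and a reconstruction that is off by 2 gray levels at every pel.
image = [[4 * r + c for c in range(4)] for r in range(4)]
approx = [[v + 2 for v in row] for row in image]
print(rms_error(image, approx))  # 2.0
```

The same routine applies to any pair of equal-size frames, so it could be used for both the intraframe comparisons here and the frame-by-frame errors of the interframe study.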

The optimality of the Fast K-L transform with just two
layers implies that the other transforms would also have
minimum error with the use of just two layers. This appears
to be true for the Discrete Cosine but was not evident for
the DLB. In the process of generating the DLB basis set,
two variables must be selected (i.e., r, s). These two
variables may take on any positive or negative integer value,
an infinite number of choices. There is no guarantee,
therefore, that the optimum has been selected. It is further
true that the bit allocation assumes uncorrelated coefficients
as input, and since this particular set is more correlated
than the other transforms tested, a poorer bit assignment
results.
Enlargement of the Discrete Cosine
at .5 bit/pel

Enlargement of Hadamard
at .5 bit/pel

Enlargement of Fourier
at .5 bit/pel

Figure 4.14

Enlargement of Fast KL
at .5 bit/pel
c. Correlation

To determine the correlated error in Table 4.1, the
correlation values were calculated for each transform at
lag distances of 1, 2, 5, and 10. The spatial correlation of
errors indicates that all transforms have their lowest
correlation at a lag distance of one, and the Fast K-L is the
most efficient in terms of this criterion. The Discrete Cosine
also has low correlation at all distances compared to the
other transforms. This is illustrated in Figures 4.18-4.20,
where the correlation is plotted as a function of lag distance
for the three compression ratios.
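
The two tabulated quantities can be illustrated with a short sketch. It assumes that the correlation is a normalized correlation between error pels a fixed number of columns apart (the report does not state the estimator used), and it uses the observation that each "Correlated Error" entry in Table 4.1 equals the RMS error multiplied by the correlation. The names `lag_correlation` and `correlated_error` are illustrative.

```python
import math

def lag_correlation(error, lag):
    """Normalized correlation between error pels `lag` columns apart."""
    pairs = [
        (row[c], row[c + lag])
        for row in error
        for c in range(len(row) - lag)
    ]
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    return cov / math.sqrt(var_a * var_b)

def correlated_error(rms, correlation):
    """Product of RMS error and correlation, as tabulated in Table 4.1."""
    return rms * correlation

# Small synthetic error image (not report data), lags 1 and 2.
error_image = [[1, -2, 0, 3, -1, 2], [0, 1, -1, 2, -3, 1]]
print(lag_correlation(error_image, 1), lag_correlation(error_image, 2))

# Fourier row of Table 4.1 at .5 bit/pel, lag distance 1.
print(round(correlated_error(4.9412, 0.11130523), 7))  # 0.5499814
```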

d. Basis Vector Comparisons

In addition to looking at the error criteria, it is
informative to investigate the sequency properties of the
basis vectors used. The discrete transforms generally exhibit
zero crossing properties called sequency. It is suggested in
the literature [2] that as the subimage size grows larger the
Discrete Cosine basis set approaches the eigenvectors of a
correlation matrix with a first order Markov assumption.

This is, however, only true if the image approaches a
Markov process. To examine this suggestion, a correlation
matrix with a first order Markov assumption was generated
with a correlation coefficient equal to 0.95.

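This matrix belongs to the class of Toeplitz matrices of the form

\[
\Phi =
\begin{pmatrix}
1 & \rho & \rho^{2} & \cdots & \rho^{M-1} \\
\rho & 1 & \rho & \cdots & \rho^{M-2} \\
\vdots & & \ddots & & \vdots \\
\rho^{M-1} & \rho^{M-2} & \cdots & \rho & 1
\end{pmatrix},
\qquad 0 < \rho < 1
\]

M is equal to the subimage size.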


Figure 4.19. Correlation as a function of lag distance, 0.75 bit/pel
(vertical axis: Correlation; horizontal axis: Lag Distance;
legend includes Slant and DLB (16)).

Figure 4.20. Correlation as a function of lag distance
(legend: Slant, DLB (16), DLB (8), DCT, Fast K-L).

The first order assumption gives the relation of gray
level dependency of the resolution cells. If the subimage
is considered as a vector of random variables
$X_1, X_2, \ldots, X_n$ with known conditional probabilities,
then for any n we have

\[
\Pr(X_n \mid X_{n-1}, X_{n-2}, \ldots, X_1) = \Pr(X_n \mid X_{n-1}).
\]

This matrix and its normalized eigenvectors are given
in Figures 4.21 and 4.22. The Discrete Cosine basis vectors
are then compared to the Markov correlation matrix
eigenvectors. Other than an obvious phase shift the vectors
are almost identical in terms of sequency and magnitude
(Figure 4.23). The other transforms which are easily
compared are the Discrete Linear Basis, Hadamard and Slant
(see Figures 4.24, 4.25, 4.26 and 4.27). Notice particularly
the variation in sequency one and sequency fifteen compared
to the Discrete Cosine.
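
The comparison described above can be sketched numerically. The following is an illustration only: it builds the 16 x 16 first order Markov correlation matrix with a correlation coefficient of 0.95, finds its two leading eigenvectors by power iteration with deflation (an assumed stand-in for whatever eigenvector routine was actually used), and measures how closely each one lines up with the corresponding Discrete Cosine basis vector. The DCT-II form of the basis and all function names are assumptions.

```python
import math

M = 16       # subimage size
RHO = 0.95   # first order Markov correlation coefficient

# Toeplitz correlation matrix: element (i, j) is RHO ** |i - j|
R = [[RHO ** abs(i - j) for j in range(M)] for i in range(M)]

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def normalize(x):
    n = math.sqrt(dot(x, x))
    return [a / n for a in x]

def leading_eigenvectors(A, count, iterations=500):
    """Power iteration with orthogonal deflation for a symmetric matrix."""
    found = []
    for k in range(count):
        # deterministic start vector with even and odd components
        x = normalize([1.0 + 0.1 * i * (k + 1) for i in range(len(A))])
        for _ in range(iterations):
            x = matvec(A, x)
            for v in found:            # remove eigenvectors already found
                c = dot(x, v)
                x = [a - c * b for a, b in zip(x, v)]
            x = normalize(x)
        found.append(x)
    return found

def dct_basis(k, n_points):
    """k-th Discrete Cosine (DCT-II) basis vector, unit length."""
    scale = math.sqrt((1.0 if k == 0 else 2.0) / n_points)
    return [scale * math.cos((2 * n + 1) * k * math.pi / (2 * n_points))
            for n in range(n_points)]

eigvecs = leading_eigenvectors(R, 2)
# Absolute inner product: 1.0 would mean identical up to sign.
similarity = [abs(dot(eigvecs[k], dct_basis(k, M))) for k in range(2)]
print(similarity)
```

Values close to one for the low sequency vectors are consistent with the near identity of the two basis sets noted in the text.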

Based on the results of the error criteria and of the
sequency plots, the "best" transform to be implemented in
the interframe process is the Discrete Cosine. The Fast
K-L transform was not utilized, based solely on the time
required to generate the basis vector sets for each of the 12
images to be used. Further investigation with regard to
classes of pictures for this transform must first be
completed (see Recommendations, Chapter 7).


Figure 4.21. First Order Markov Correlation Matrix, ρ = .95 (16x16).
Element (i, j) of the printed matrix is ρ^|i-j|; the first row reads

1.000000  0.950000  0.902500  0.857375  0.814506  0.773781  0.735092  0.698337
0.663420  0.630249  0.598737  0.568800  0.540360  0.513342  0.487675  0.463291

(each row occupies two printed lines).


Figure 4.22. Eigenvectors of Correlation Matrix, read in
columns alternating rows, i.e., 1st vector contains 1st element
of rows 1, 3, 5, 7, etc. (The printed eigenvector values are
not legible in this copy.)
SECTION V


THE INTERFRAME PROCESS

1. INTRODUCTION

The previous chapters have dealt with the case of one
monochromatic frame. To consider the system concept the
multi-frame process must be considered. The achievement of
large compression ratios, in the order of 50:1, relies upon
capitalizing on interframe compression. Aside from theoretical
results, relatively few research results have been published
which report on any but well known techniques for images in
which there is motion. The latest approach, discussed by
Habibi [1] with further results by the University of Southern
California (SC) [2], uses a three-dimensional transform
technique. The difficulty here is the complexity of
three-dimensional transforms and the storage required.

In the more widely published DPCM approach, only one frame
need be stored at a time and the implementation is relatively
easy; however, the compression ratio is low.

In this chapter we report on a suggested combination of
transform compression, DPCM, and frame rate reduction. It
is this combination which yields the necessary 50:1 compression
for the system concept.

2. Channel Noise for the Discrete Cosine

In the selection of the Discrete Cosine transform as the
"best" for the interframe study we must concern ourselves with
its response to channel noise. To give an idea of how this
transform might perform, a Binary Symmetric channel with bit
by bit transmission was assumed. Although not a complete
analysis, it may be used as an indicator of probable
performance for given noise probabilities.
In any given transform, the variable length coding scheme
(Chapter 3) assigns a fixed number of bits to a subimage.
Under this constraint we can consider the sequence of 0's
and 1's as a sequence of independent Bernoulli trials. The
probability of receiving (n-k) correct digits or exactly
k erroneous digits is:

\[
P\{k \text{ errors}\} = \binom{n}{n-k}\, p^{\,n-k}\, q^{\,k}
\]

where:

\[
q = 1 - p
\]

If we consider an N-dimensional vector communication
channel where the transmitter is defined by a set of M signal
vectors, $\{S_i\}$, then when message $m_i$ is selected for
transmission, vector $S_i$ is transmitted.

\[
S_i = (s_{i1}, s_{i2}, \ldots, s_{iN}), \qquad i = 0, 1, \ldots, M-1
\]

If we choose a convenient set of orthonormal functions
$\{\phi_j(t)\}$ the signal may be defined in this signal space
with N mutually perpendicular axes labeled
$\phi_1, \phi_2, \ldots, \phi_N$. If $\phi_j$ denotes the unit
vector along the jth axis, $j = 1, 2, \ldots, N$, then each
N-tuple above describes the vector

\[
s_i = s_{i1}\phi_1 + s_{i2}\phi_2 + \cdots + s_{iN}\phi_N
\]

If we treat our image data as $M = 2^N$ equally likely messages
and locate these signals in an orthogonal space, i.e., vertices
of an N dimensional hypercube, then Wozencraft and Jacobs [3]
prove that no error is made if the noise vector component
$n_i < d/2$, where d is the distance between vertices, and
$d/2$ in fact is the decision boundary. Now for the assumption
of white additive noise, a constant term $N_0/2$ with a variance
$\sigma^2 = R(\tau) = \frac{N_0}{2}\,\delta(\tau)$ for zero mean
is made. Thus the joint density function $p_n$ is

\[
p_n(\alpha) = \frac{1}{(\pi N_0)^{N/2}}\; e^{-|\alpha|^2 / N_0}
\]

The probability of error is then:

\[
P[E \mid m_i] = \int_{\frac{d/2}{\sqrt{N_0/2}}}^{\infty}
\frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}\, dy
= \Phi\!\left(\frac{d/2}{\sqrt{N_0/2}}\right)
\]

where $\Phi$ is the erf function defined by:

\[
\Phi(\alpha) = \frac{1}{\sqrt{2\pi}} \int_{\alpha}^{\infty} e^{-y^2/2}\, dy
\]
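
The tail integral above can be evaluated numerically. This is a sketch under the assumption that the probability of error is the Gaussian tail integral taken from $(d/2)/\sqrt{N_0/2}$ to infinity; the names `phi_tail` and `error_probability` and the sample values of d and $N_0$ are hypothetical and do not come from the report.

```python
import math

def phi_tail(alpha):
    """(1/sqrt(2*pi)) * integral from alpha to infinity of exp(-y^2/2) dy."""
    return 0.5 * math.erfc(alpha / math.sqrt(2.0))

def error_probability(d, n0):
    """Error probability for decision boundary d/2 and noise variance N0/2."""
    return phi_tail((d / 2.0) / math.sqrt(n0 / 2.0))

# Hypothetical values: the error probability falls as the vertex
# distance d grows relative to the noise level.
for d in (1.0, 2.0, 4.0):
    print(d, error_probability(d, 1.0))
```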


The error probability for this study is a given, and takes
on the values of $p = 10^{-2}, 10^{-3}, 10^{-4}$ (here p denotes
the channel error probability per digit, which plays the role
of q in the binomial expression). If we choose a random
variable V to represent the number of erroneous digits in a
message then V takes on values $\mu = 0, 1, 2, \ldots, n$ with
probabilities:

\[
P(V = \mu) = \binom{n}{n-\mu}\, p^{\,n-\mu}\, q^{\,\mu}
\]

and the average value is:

\[
E\{V\} = \sum_{\mu=0}^{n} \mu\, P\{V = \mu\} = nq
\]

Stated simply, in each sequence of n binary digits we can
expect nq digits to be altered by noise, on the average. The
simulation, therefore, generates an average of nq alterations
by adding white noise to the coded bit assignments.
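
The expected number of altered digits can be checked directly from the binomial probabilities. The sketch below assumes q is the per-digit error probability and n = 256 bits per coded subimage (a hypothetical figure, corresponding to a 16 x 16 subimage at 1 bit/pel); the function names are illustrative.

```python
import math

def prob_k_errors(n, k, q):
    """Probability of exactly k erroneous digits among n independent digits."""
    p = 1.0 - q                      # probability a digit is received correctly
    return math.comb(n, n - k) * (p ** (n - k)) * (q ** k)

def expected_errors(n, q):
    """Mean number of erroneous digits, summed from the distribution."""
    return sum(k * prob_k_errors(n, k, q) for k in range(n + 1))

n = 256   # hypothetical number of bits in one coded subimage
for q in (1e-2, 1e-3, 1e-4):
    # the summed mean and the closed form n*q should agree
    print(q, expected_errors(n, q), n * q)
```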

The Discrete Cosine Transform and optimum bit allocation
programs were combined to investigate the noise effects.
Each sub-image was treated as a random message in a binary
symmetric channel. The white noise generation from the
computer made it possible to randomly change message bits at
the given threshold corresponding to the probability of error.

The results indicated a steep rise in the RMS error of
the image as a function of the error probabilities (Figure
5.1). This steep increase in error can be explained by the
way the system is simulated. If eight bits, for example,
are assigned to a component by the coding algorithm there is
a possibility of 256 "Max" quantized values being used.
These estimated coefficients are not actually subjected to
the error. Instead, an index is transmitted for each
component indicating which level of the 256 should be used. A
single error in the predominant (most significant) bit of the
index could make a gross change in the coefficient level used.
The RMS error of the reconstructed picture is, therefore, a
function of the simulation used and should not be used as
conclusive evidence of the effect of errors in the channel.
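
The sensitivity of an index to a single bit error can be seen in a small sketch. It assumes an 8-bit index sent bit by bit over a binary symmetric channel with per-digit error probability q; the function `transmit_index` and the example index are hypothetical and only illustrate the mechanism described above.

```python
import random

def transmit_index(index, n_bits, q, rng):
    """Send an n_bits quantizer index over a binary symmetric channel."""
    received = index
    for bit in range(n_bits):
        if rng.random() < q:         # this digit is altered by noise
            received ^= 1 << bit
    return received

# Effect of one error in an 8-bit index (256 quantizer levels).
index = 0b00010011                   # level 19
print(index ^ 0b00000001)            # least significant bit altered: 18
print(index ^ 0b10000000)            # most significant bit altered: 147

# Simulated transmission at a hypothetical error probability of 1e-2.
rng = random.Random(1)
print(transmit_index(index, 8, 1e-2, rng))
```

An error in the low-order bit moves the index to a neighbouring level, while the same error in the high-order bit moves it 128 levels away, which is the gross change referred to in the text.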

Reasonable specification limits in terms of error
probabilities are $10^{-5}$ to $10^{-6}$. Although not
requested, the system was simulated at $10^{-5}$ probability
of error. A comparison of RMS error with and without channel
noise resulted in the following:

Avg. bits/pel     RMS, No Noise     RMS, White Noise p = 10^-5
1 bit/pel         3.6503            3.6504
.75 bit/pel       3.9727            3.9749
.5 bit/pel        4.43664           4.4365

It may be concluded therefore that except for error
probabilities higher than $10^{-5}$, this system coding is
accurate.


Figure 5.1. RMS error as a function of error probability.

3. DPCM

A common method for transmitting images at moderate
compressions is Differential Pulse Code Modulation (DPCM).
In this approach, the differences between signal samples are
transmitted rather than the signals themselves. This has been
adapted for use in the transform domain. This method of
transform/DPCM encodes differences in transform coefficients
instead of picture element amplitudes. It is anticipated that
this should be less sensitive to noise than the straight DPCM
predictive schemes and achieve an additional 1.5 compression.
Figure 5.2 illustrates the transform/DPCM approach. The
compression is given by:

\[
CR = \frac{K N_t}{N_t + (K-1)\, N_i}
\]

where:

K is the number of frames being considered to an
error specification

$N_t$ is the total number of bits for the first frame

$N_i$ is the number of bits for each of the (K-1) predicted frames
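
The compression relation can be exercised with a few lines of code. This sketch assumes the ratio is K times the first-frame bit count divided by the bits actually sent for K frames (one fully coded frame plus K-1 predicted frames); the bit counts in the example are hypothetical and were chosen only so that K = 4 gives a ratio of 1.5.

```python
def dpcm_compression_ratio(k, n_first, n_predicted):
    """CR = K*Nt / (Nt + (K-1)*Ni) for a group of K frames."""
    return (k * n_first) / (n_first + (k - 1) * n_predicted)

# Hypothetical bit counts: 9000 bits for the first frame and
# 5000 bits for each of the three predicted frames.
print(dpcm_compression_ratio(4, 9000, 5000))  # 1.5
```

When the predicted frames need as many bits as the first frame the ratio falls to one, so the gain comes entirely from the difference frames being cheaper to code.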

The original frames supplied by the Air Force Avionics
Lab were reduced from 24 frames/sec to 8 frames/sec (a 3:1
frame rate reduction). Histograms of the differences in the
spatial domain and frequency domain were compared (see Figures
5.3 and 5.4). DPCM was then applied to the difference images
in the frequency domain. The RMS error for 1 bit/pel (6:1
compression) was 1.0012 for the first frame and 1.5033 for
the predicted frame. The RMS error using 1 bit/pel without
prediction was only 1.2202. This indicated, on these images,
that the frequency distribution of the actual image was
narrower than that of the difference image. This was verified
by additional histograms (see Figure 5.5). This implies that
not all gray levels are being utilized. The dynamic range
was, therefore, changed by equal interval quantization; the
resulting histogram was wider than previously (Figure 5.6)
and the RMS error was reduced to 1.09 for the predicted value.
DPCM could then be used efficiently. Coding the differences
is efficient and produces savings in code only if the original
images are not flat. These supplied images do have a flat
spatial property. Therefore, it is recommended that equal
interval quantizing be applied to utilize the dynamic range.

Figure 5.3. Histogram of Spatial Differences for DPCM

Figure 5.5. Histogram of transformed original

Figure 5.6. Histogram of original with equal interval
quantization to 32 levels.
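
One way to picture equal interval quantization is the sketch below. It assumes the occupied gray-level range (minimum to maximum) is divided into 32 equal intervals and each pel is replaced by its interval number; the report does not give the exact mapping, so the function `equal_interval_quantize` and the toy image are illustrative only.

```python
def equal_interval_quantize(image, levels=32):
    """Map gray levels onto `levels` equal intervals spanning min..max."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:                     # perfectly flat image: one level only
        return [[0 for _ in row] for row in image]
    step = (hi - lo) / levels
    return [[min(int((v - lo) / step), levels - 1) for v in row]
            for row in image]

# A "flat" toy image that occupies only 8 of the 64 gray levels.
flat_image = [[20, 21, 22, 23], [24, 25, 26, 27]]
print(equal_interval_quantize(flat_image))
# [[0, 4, 9, 13], [18, 22, 27, 31]]
```

After the mapping the pels are spread over the full 0 to 31 range, which is the widening of the histogram described in the text.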

Twelve frames of the supplied imagery were processed
under the assumption of 3:1 frame rate reduction and DPCM.
The DPCM process used a C.R. of 1.5 and K=4, i.e., updating
every fourth frame. Figures 5.7, 5.8, 5.9 and 5.10 visually
compare these frames with the original. Figure 5.11
illustrates the RMS error as a function of frame number. This
plot is actually a time sequence showing that the updated
frames have much less error than those frames being predicted.
The original frame-to-frame error, which is relatively
constant, may be compared to the reconstructed frame error as
in Figure 5.12. The relationship remains linear for constant
motion of the aircraft, as anticipated. These results indicate
that with a 12:1 compression in the transform domain, a 50:1
compression is feasible for the system concept (12 x 3 x 1.5
= 54). Since the frame-to-frame correlation is high, indeed
the frames are almost the identical image, perhaps there is a
more efficient way of transmitting differences.

4. Conditional Replenishment

Since there is little change on a frame-to-frame basis,
perhaps the only changes that need be transmitted are the
significant changes. This approach is referred to as
conditional replenishment and would achieve more efficient
coding. Of course, position data of the significant change
would also be required. The question then arises: what is a
significant change? In an effort to extend what has been
reported so far in this project, a suggestion for a
significant change criterion is made.
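
The mechanics of conditional replenishment, independent of the particular change criterion, can be sketched as follows. This is an illustration only: the mean absolute frame-to-frame difference and the threshold used here are placeholders for a significance test and are not the criterion proposed in this report; the function names are hypothetical.

```python
def changed_subimages(previous, current, block, threshold):
    """Return (row, col, pels) for each block x block subimage whose mean
    absolute frame-to-frame change exceeds `threshold` (placeholder test)."""
    updates = []
    for r in range(0, len(current), block):
        for c in range(0, len(current[0]), block):
            cur = [row[c:c + block] for row in current[r:r + block]]
            prev = [row[c:c + block] for row in previous[r:r + block]]
            total = sum(abs(a - b)
                        for ra, rb in zip(cur, prev)
                        for a, b in zip(ra, rb))
            count = sum(len(row) for row in cur)
            if total / count > threshold:
                updates.append((r, c, cur))   # position data plus new pels
    return updates

def replenish(previous, updates):
    """Rebuild the current frame from the previous frame and the updates."""
    frame = [row[:] for row in previous]
    for r, c, pels in updates:
        for dr, row in enumerate(pels):
            frame[r + dr][c:c + len(row)] = row
    return frame

prev_frame = [[10] * 4 for _ in range(4)]
cur_frame = [row[:] for row in prev_frame]
cur_frame[2][2], cur_frame[3][3] = 40, 35     # change confined to one subimage
updates = changed_subimages(prev_frame, cur_frame, 2, 1.0)
print([(r, c) for r, c, _ in updates])         # [(2, 2)]
print(replenish(prev_frame, updates) == cur_frame)
```

Only one of the four subimages is sent, together with its position, which is the saving conditional replenishment is intended to provide.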
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Relations 


Hypothesis: A figure of merit exists which relates
change on a frame-to-frame basis in terms of correlation.

Define the set $x_1, x_2, \ldots, x_p$ to be random variables
of the observation vector of any sub-image. Let the observation
vector be designated by $X_m$, where m denotes the mth frame
and the range of m is 1...M. Capital M is the total number of
frames.

Then:

\[
X_m = (x_1, x_2, \ldots, x_p)^T
\]

The covariance of $X_m$ (the observation vector), if the set of
random variables $\{x_1, x_2, \ldots, x_p\}$ has zero mean, is:

\[
\text{covariance of the } m\text{th frame}
= \mathrm{Cov}(X_m) = E\{X_m X_m^T\}
= \frac{1}{N}\sum_{k=1}^{N} X_{mk} X_{mk}^T
\]

where N is the total number of subimages.

Let the preceding frame be designated by an observation
vector $X_{m-1}$; the covariance of this frame is

\[
\text{covariance of the } (m-1)\text{th frame}
= \mathrm{Cov}(X_{m-1}) = E\{X_{m-1} X_{m-1}^T\}
= \frac{1}{N}\sum_{k=1}^{N} X_{(m-1)k} X_{(m-1)k}^T
\]

Now define the sequence observation vector as:
Cov(S) for the case of two frames =

\[
\left(\begin{array}{ccc|ccc}
\sigma^2_{11}(m,m) & \cdots & & \sigma^2_{11}(m,m-1) & \cdots & \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
 & \cdots & \sigma^2_{pp}(m,m) & & \cdots & \sigma^2_{pp}(m,m-1) \\
\hline
\sigma^2_{11}(m-1,m) & \cdots & & \sigma^2_{11}(m-1,m-1) & \cdots & \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
 & \cdots & \sigma^2_{pp}(m-1,m) & & \cdots & \sigma^2_{pp}(m-1,m-1)
\end{array}\right)
\]

Three important properties of this covariance of time varying
errors are:

a. The random processes giving rise to the error vectors may
or may not be stationary. If the second order statistics
are changing in time this information will be clearly
conveyed by the successive total covariance matrices.

b. The matrices $E\{X_m X_{m-j}^T\}$ for $j \neq 0$ may or may
not be null. These matrices form the off diagonal elements of
covariance $(S\,S^T)$, so if observation errors of any one
frame with any other frame are uncorrelated these terms will
be null. If correlation exists some or all may be non-null.
[Stagewise correlation].

c. The main diagonal blocks (note single frame subscript) may
or may not be uncorrelated. If they are uncorrelated then
we have local non-correlation. As an example:
if

\[
\mathrm{Cov}(S) = \sigma^2 I
\]

where $\sigma$ is a constant, then we would have:

a. second order statistics stationary

b. errors are stagewise uncorrelated

c. errors are locally uncorrelated.

If, however,

Cov(S) = [partitioned matrix; entries not legible in this copy]

where n, n-1 are two consecutive frames and θ is a real
constant, then we have

a. non-stationary

b. stagewise uncorrelated

c. locally correlated

Therefore, since the frame-to-frame correlation is high, we
would expect to see more evidence of local non-correlation and
high stagewise correlations unless a significant change
occurs, in which case the stagewise correlation should be less.

So define F, a figure of merit:

\[
F(m, m-1) = \frac{\operatorname{trace}\,\Phi_{m-1,m}}
{\sqrt{\operatorname{trace}\,\Phi_{m,m}\;\operatorname{trace}\,\Phi_{m-1,m-1}}}
\]

where m, m-1 denotes two frames and $\Phi$ is the covariance
matrix in partitioned form.

See Figure 5.13.
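
A figure of merit of this kind can be sketched numerically. The sketch assumes that F is the trace of the cross-covariance block between two frames divided by the square root of the product of the traces of the two single-frame blocks (the normalization is an assumption, since the printed expression is damaged), and that each covariance is estimated as an average over the N zero mean subimage vectors. The function name `figure_of_merit` is hypothetical.

```python
import math

def figure_of_merit(frame_prev, frame_cur):
    """Trace-based correlation between two frames.

    frame_prev, frame_cur: lists of N subimage vectors (zero mean assumed),
    with corresponding subimages in the same list position.
    """
    n = len(frame_cur)
    # trace of the cross block: average inner product of paired subimages
    cross = sum(sum(a * b for a, b in zip(xp, xc))
                for xp, xc in zip(frame_prev, frame_cur)) / n
    # traces of the two main diagonal blocks
    auto_prev = sum(sum(a * a for a in xp) for xp in frame_prev) / n
    auto_cur = sum(sum(a * a for a in xc) for xc in frame_cur) / n
    return cross / math.sqrt(auto_prev * auto_cur)

frame_a = [[1.0, -1.0], [2.0, -2.0]]
frame_b = [[1.0, -1.0], [2.0, -2.0]]       # identical frame
frame_c = [[1.0, 1.0], [-2.0, -2.0]]       # orthogonal subimages
print(figure_of_merit(frame_a, frame_b))   # 1.0
print(figure_of_merit(frame_a, frame_c))   # 0.0
```

Under these assumptions F stays near one while consecutive frames are nearly identical and drops when a significant change occurs, so a threshold on F could serve as the replenishment decision.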

This process may also be applied to error images which
are made up of the differences of any two consecutive frames.

If we consider any one frame we can estimate the next
frame by a prediction filter

\[
\hat{Y}_{m+1} = W(p)\, Y_m
\]

where $\hat{Y}_{m+1}$ is the predicted observation and W(p) is
a transition matrix with p-step prediction. In this example
p = 1.

This filter would also operate on the error images to
predict the next error image. If $E_m$ represents the
difference image between frame m and m-1 then
$\hat{E}_{m+1} = W(1)\, E_m$ is the estimate of the next error
image. The W(0) matrix, a zero order predictor, would be

\[
W(0) = I
\]

so that $Y_m = W(0)\, Y_m$. Each of the new covariances may be
defined by covariance $(S) = E\{S\,S^T\}$ and the estimated
covariance $(\hat{S}) = W(p)\,E\{S\,S^T\}\,W^T(p)$. Therefore,
in conclusion, as each frame is transmitted a several step
prediction can be used to predict the next few frames. This
will smooth the image transitions and save bandwidth by
reducing the number of frames transmitted. Further, the figure
of merit for additional frames can be estimated by the
prediction process.
SECTION VI


CONCLUSIONS AND RECOMMENDATIONS

1. CONCLUSIONS

It may be concluded from the data submitted within this
report that the Fast Karhunen Loeve more closely approximates
the optimum Principal Components than any other fast
transform. This transform also exhibits an adaptive nature
since the eigenvectors used for the basis set change with each
image. If a class of images which have nearly the same
eigenvectors is found, the implementation would be as rapid
as any other fast transform.

The Discrete Cosine was chosen as the "best" transform
because of its small error and ease of implementation.
Hardware implementation of this transform may be feasible. The
Discrete Linear Basis with the (8*2*8*2) direct product
implementation is nearly as accurate as the Discrete Cosine
and in fact more feasible for hardware implementation since
it can be accomplished with integer arithmetic. The small
increase in error would be less significant than the savings
in storage and running time.

In terms of the optimum bit allocation with the Discrete
Cosine it appears that satisfactory operation could be
accomplished with the noisy channel if the noise is held to
less than $10^{-5}$ probability of error for a selected
transmission rate.

From a system point of view it is feasible to compress
the images in the order of 50:1 with reasonable errors.
Additional compression is possible if additional errors, or
flicker with frame rate reduction in the order of 5 to 10:1,
may be tolerated. The implementation of the hybrid Discrete
Cosine/DPCM seems reasonable compared to the required storage
for three dimensional transform approaches.
2. RECOMMENDATIONS


In the process of this investigation several unresolved
problems were identified which it is recommended be pursued.
The Fast K-L transform basis set is image dependent. The
question must then be asked: how much error exists in
contiguous imagery if the basis set is not changed for each
frame? Further, for how many frames could one basis set be
used, or is this particular transform possible for an entire
class of imagery, such as desert areas or forest areas, where
the background does not change significantly?

Since the Fast K-L transform indicates that the
optimum number of layers for a fast transform is two for
the direct product implementation, why was the Discrete
Linear Basis using two layers poorer than the direct product
of 8 and 2 resulting in four layers? The question should be
pursued on the basis of the optimum generation of a 16 x 16
basis set and investigation of why the sequency property is
lost. If this happens in general, then what ordering of the
basis vectors should be utilized?

Further improvement of images under compression may be
possible by a permuting process. A brief look at this type
of implementation revealed it would be necessary to transmit
the mean of each sub-image rather than the mean of the entire
image. Time did not allow a complete analysis of this
approach.

Another unresolved problem is the efficient extension of
Haralick's storage saving Discrete Cosine implementation
to two dimensions. The present method requires a subroutine
call to a Fast Fourier routine four times, which increases
the running time. An efficient method calling the Fast
Fourier routine only once for the forward and inverse
transforms seems reasonable but has not been found.
Further recommendations must be made concerning
the conditional replenishment. With the tools now
available, a sequence of frames with much more activity
should be examined. Also, built into our suggested
significant change criterion is an overall averaging
effect. This could be examined for different size
subimages, or expanded so that it is possible to decide
which subimages should be changed, not just which frames.
IMAGE DATA COMPRESSION: THE INCOMPLETE FAST TRANSFORM

ABSTRACT

One frequently used image data compression method is
based on transform coding. In this paper we show that
transform coding of image data really consists of applying
a fast transform in an incomplete fashion. In terms of RMS
error, the best transform is the principal components
(Karhunen-Loeve) one. We show that under certain
conditions there exists a fast principal components transform.

INTRODUCTION 

One of the first pattern recognition tasks is the
preprocessing of data to normalize and/or reduce
dimensionality. Data compression is a natural place to look for
information preserving procedures which reduce the number
of data points or dimensions. One frequently used data
compression method is that based on transform coding
using one of the fast transforms such as FFT, FHT or DLB.

In this note we illustrate that transform coding of image
data really consists of applying a fast transform in an
incomplete fashion. In addition we give a semi-fast
algorithm for the KL transform of an isotropic 4x4 image and
discuss the additional assumptions for the fast transform.

Figure 1 illustrates the general idea of transform
coding. The image is partitioned into equal sized
rectangular subimages, say M rows by N columns each, a fast
forward transformation is done on each subimage, and
compression is achieved by selecting only those transformed
components having relatively high energy (the non-selected
components are set to zero). A reconstruction of the
compressed image is achieved by applying an inverse
transformation, subimage by subimage, using a zero value
for the non-selected transformed components. From our
perspective of the incomplete fast transform, the key
details of the transform coding technique are the
partitioning of the image into subimages and the application of a
fast transform on each subimage.
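
The steps just described can be sketched in a few lines. The
following is a minimal illustration, not the implementation used
in this study: it assumes square subimages, uses a Hadamard
transform as the fast transform, and selects components by their
magnitude within each subimage (the text selects components by
energy, which would normally be estimated over many subimages).
The names `hadamard` and `transform_code` and the parameters
`block` and `keep` are hypothetical.

```python
import numpy as np

def hadamard(n):
    """Unnormalized n x n Hadamard matrix (n must be a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.kron(np.array([[1.0, 1.0], [1.0, -1.0]]), H)
    return H

def transform_code(image, block=4, keep=4):
    """Blockwise transform coding; keeps the `keep` largest coefficients per subimage."""
    H = hadamard(block) / np.sqrt(block)       # orthonormal and symmetric
    out = np.zeros_like(image, dtype=float)
    for r in range(0, image.shape[0], block):
        for c in range(0, image.shape[1], block):
            sub = image[r:r + block, c:c + block]
            coef = H @ sub @ H                 # forward two-dimensional transform
            thresh = np.sort(np.abs(coef).ravel())[-keep]
            coef[np.abs(coef) < thresh] = 0.0  # non-selected components set to zero
            out[r:r + block, c:c + block] = H @ coef @ H   # inverse transform
    return out

img = np.arange(64, dtype=float).reshape(8, 8)
approx = transform_code(img, block=4, keep=4)
print(np.sqrt(np.mean((img - approx) ** 2)))   # RMS error of the compressed image
```

Keeping all `block * block` coefficients reproduces the image
exactly; reducing `keep` trades reconstruction error for
compression.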

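The fast transform is based on a direct product of vectors
concept (see Good, 1971; Haralick & Shanmugam, 1973).
Let x be an M x 1 vector

$$x = (x_1, x_2, \ldots, x_M)'$$

and y be an N x 1 vector

$$y = (y_1, y_2, \ldots, y_N)'.$$

Then the direct product $x \otimes y$ of x with y is an NM x 1
vector and can be defined by

$$(x \otimes y)' = (x_1 y_1,\ x_1 y_2,\ \ldots,\ x_1 y_N,\ \ldots,\ x_M y_1,\ \ldots,\ x_M y_N).$$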

To see how this relates to the fast transform, let

$$B_1 = \left\{ \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \end{pmatrix} \right\}$$

and

$$B_2 = \left\{ \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}, \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix} \right\}$$

be two sets of basis vectors. $B_1$ is the 2 x 1 set of
Hadamard vectors. $B_2$ is a 3 x 1 set of discrete linear basis
(DLB) vectors. Consider the set of vectors which results
when we take the direct product of each vector in $B_1$ with
each vector in $B_2$:

$$B_3 = \left\{ \begin{pmatrix} 1\\1\\1\\1\\1\\1 \end{pmatrix}, \begin{pmatrix} 1\\0\\-1\\1\\0\\-1 \end{pmatrix}, \begin{pmatrix} 1\\-2\\1\\1\\-2\\1 \end{pmatrix}, \begin{pmatrix} 1\\1\\1\\-1\\-1\\-1 \end{pmatrix}, \begin{pmatrix} 1\\0\\-1\\-1\\0\\1 \end{pmatrix}, \begin{pmatrix} 1\\-2\\1\\-1\\2\\-1 \end{pmatrix} \right\}$$

There results a 6 x 1 set of discrete linear basis vectors.
Furthermore, as shown in figure 2, there exists a fast way
to transform any vector x by the basis vectors in $B_3$. The
reader may verify that the 4 x 1 Hadamard transform is
defined by the direct product of $B_1$ with $B_1$. The form of
the fast Fourier transform and the Slant transform are
really just like the form of the simple fast transform shown
in figure 2 but with minor variations. In the FFT, before most
of the outputs of one layer can proceed to the next layer,
they are multiplied by a complex number, the twiddle
factor. In the Slant transform, before two of the outputs
of one layer can proceed to the next layer, they are put
through a 2 x 2 transformation. The FFT and Slant
implementation are illustrated in figures 3 and 4 respectively.
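
The direct product construction and its two-layer evaluation can
be checked numerically. The sketch below is illustrative only; it
assumes the 2 x 1 Hadamard vectors (1, 1) and (1, -1) and the
3 x 1 DLB vectors (1, 1, 1), (1, 0, -1), (1, -2, 1), stored as
matrix rows, and the function name `fast_transform` is
hypothetical.

```python
import numpy as np

B1 = np.array([[1, 1], [1, -1]], dtype=float)                    # rows: 2 x 1 Hadamard vectors
B2 = np.array([[1, 1, 1], [1, 0, -1], [1, -2, 1]], dtype=float)  # rows: 3 x 1 DLB vectors

# Rows of B3 are the direct products of each vector in B1 with each vector in B2.
B3 = np.kron(B1, B2)

def fast_transform(v):
    """Two-layer evaluation of B3 @ v for a length-6 vector v."""
    grid = v.reshape(2, 3)     # v[3*i + j] is stored at grid[i, j]
    layer1 = grid @ B2.T       # 3-point transform within each group of three
    layer2 = B1 @ layer1       # 2-point transform across the two groups
    return layer2.ravel()

v = np.arange(6, dtype=float)
print(np.allclose(fast_transform(v), B3 @ v))
```

The brute-force product `B3 @ v` uses 36 multiplications, while
the layered form uses 18 + 12 = 30; the saving grows quickly with
the size of the transform.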

Figure 5 shows a simple 4x4 image which we want to
transform with a two-dimensional FFT. The definition of
the general two-dimensional Fourier transform is given by

$$X(m,n) = \sum_{k=0}^{N-1} \sum_{l=0}^{N-1} x(k,l)\, e^{-2\pi j (mk + nl)/N} \qquad (1)$$

Rearranging the summation,

$$X(m,n) = \sum_{k=0}^{N-1} e^{-2\pi j\, mk/N} \sum_{l=0}^{N-1} e^{-2\pi j\, nl/N}\, x(k,l)$$

Thus we see that to do a two-dimensional transformation
we need only do a one-dimensional transformation for each
row of x and after all rows have been transformed we do a
one-dimensional transformation for each column. The
two-dimensional transformation on the $N^2$ components of x has
no more than twice the number of operations as the
corresponding one-dimensional transformation on the N rows or
columns of x. Figure 6 illustrates an implementation for the
fast two-dimensional transformation of x.
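
The row-then-column decomposition is easy to confirm with a
library FFT. This short check is only an illustration using
NumPy's `numpy.fft` routines; the function name
`dft2_separable` is hypothetical.

```python
import numpy as np

def dft2_separable(x):
    """Two-dimensional DFT computed as row transforms followed by column transforms."""
    rows_done = np.fft.fft(x, axis=1)      # one-dimensional transform of every row
    return np.fft.fft(rows_done, axis=0)   # then of every column

x = np.arange(16, dtype=float).reshape(4, 4)
print(np.allclose(dft2_separable(x), np.fft.fft2(x)))
```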

The row-wise then column-wise transformation is not the
only way the two-dimensional fast transform can be
implemented. Rewriting the two-dimensional transform to look
like a four-dimensional transform
Now it appears that we pick every M-th point on a row
and every M-th point on a column and apply a two-dimensional
transform on them. This amounts to taking


transforms. Then multiply through by twiddle factors and
take M* more transforms on the corresponding points of the
transform just taken. Figure 7 illustrates this
implementation form for the two-dimensional FFT.

Suppose now we permute the pixels of the 4x4 image,
as shown in figure 8, and consider taking the
two-dimensional FFT of the permuted image (implemented as in
figure 7). What happens if instead of carrying through
with the four layer FFT on the permuted image we only
carry through for two layers? Figure 9 shows the
resulting incomplete fast transform on the permuted image. It
appears that the incomplete transform on the permuted
image is identical to the transform obtained by
partitioning the 4x4 image into 4 2x2 subimages and taking an
FFT on each subimage. It should be clear from the form of
equation (2) that this is the general situation for the
Fourier transform. From the general form of the simple
fast transform which is based on the direct vector product
of basis vectors, from some basis sets, it should also be
clear that an incomplete fast transform on the entire image
corresponds to permuting the pixels (this allows the
identity permutation), partitioning the image into
subimages, and taking a complete transform on each subimage.
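
The equivalence between the incomplete transform on a permuted
image and a complete transform on each subimage can be verified
for the 4x4 case. The sketch below makes several assumptions that
are not taken from the figures: the first two layers are modelled
as a 2x2 DFT on every stride-2 sub-lattice (no twiddle factors,
no later layers), and the pixel permutation along each axis is
taken to be `[0, 2, 1, 3]`. The function names are hypothetical.

```python
import numpy as np

def incomplete_fft2(img, M=2):
    """First layers only: a 2-D DFT on every M-strided sub-lattice of the image."""
    out = np.empty(img.shape, dtype=complex)
    for r in range(M):
        for c in range(M):
            out[r::M, c::M] = np.fft.fft2(img[r::M, c::M])
    return out

def blockwise_fft2(img, B=2):
    """Complete 2-D DFT of each B x B subimage."""
    out = np.empty(img.shape, dtype=complex)
    for r in range(0, img.shape[0], B):
        for c in range(0, img.shape[1], B):
            out[r:r + B, c:c + B] = np.fft.fft2(img[r:r + B, c:c + B])
    return out

perm = [0, 2, 1, 3]                       # assumed permutation along each axis
img = np.arange(16, dtype=float).reshape(4, 4)
permuted = img[np.ix_(perm, perm)]

# Same coefficients, up to the same reordering of output positions.
same = np.allclose(incomplete_fft2(permuted)[np.ix_(perm, perm)], blockwise_fft2(img))
print(same)
```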

COMPOSITE  MATRICES 

The purpose of the transform coding compression
techniques is the reduction of dimensionality while
preserving data structure. When mean squared error is the
criterion used in judging how well data structure is
preserved, the optimal transform technique is the principal
components, or by another name the Karhunen-Loeve (KL),
transform. Assuming the image has zero mean, the
basis vectors of this transformation are given by the
eigenvectors of the autocovariance matrix for the set of
subimages into which the image is partitioned. Thus in
order to do a good job of data compression we must
select a fast transform technique whose basis vectors are
the eigenvectors of matrices similar to the form of the
autocovariance matrix of the image. In this section of
the paper we explore the form of the autocovariance
matrix and the kinds of matrices which the fast transforms
diagonalize.

We  begin  with  some  examples.  Each  of  these  examples 
is  alike  in  the  sense  that  the  eigenvectors  depend  only  on 
the  form  of  the  matrix  and  not  on  any  of  the  values  the 
elements  of  the  matrix  might  take. 


Notice that in each of these examples, the eigenvalue
corresponding to an eigenvector is easily obtained as the
dot product of the eigenvector with the first row of the
matrix, e.g.

$$\begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1/3 & -1/3 & -1 \\ 1 & -1 & -1 & 1 \\ 1 & -3 & 3 & -1 \end{pmatrix} \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} = \begin{pmatrix} a + b + c + d \\ a + b/3 - c/3 - d \\ a - b - c + d \\ a - 3b + 3c - d \end{pmatrix}$$

This occurs because the first component of each eigenvector
is 1.
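
The first-row property can be checked numerically on two of the
matrix forms that appear among this paper's examples: the 3 x 3
form with discrete linear basis eigenvectors and the 4 x 4
circulant form with Fourier eigenvectors. The values of a, b, c,
d below are arbitrary, and the helper name
`first_row_eigenvalues` is hypothetical.

```python
import numpy as np

a, b, c, d = 5.0, 2.0, 1.0, 0.5   # arbitrary illustrative values

A3 = np.array([[a, b, c], [b, a - b + c, b], [c, b, a]])
V3 = np.array([[1, 1, 1], [1, 0, -1], [1, -2, 1]], dtype=float)   # rows are eigenvectors

C4 = np.array([[a, b, c, d], [d, a, b, c], [c, d, a, b], [b, c, d, a]])  # circulant
V4 = np.array([[1, 1, 1, 1], [1, 1j, -1, -1j], [1, -1, 1, -1], [1, -1j, -1, 1j]])

def first_row_eigenvalues(A, V):
    """Eigenvalues as first-row dot products, after checking A v = lambda v."""
    lams = V @ A[0]   # plain (unconjugated) dot product of each eigenvector with row 0
    for v, lam in zip(V, lams):
        assert np.allclose(A @ v, lam * v)
    return lams

print(first_row_eigenvalues(A3, V3))   # a+b+c, a-c, a-2b+c
print(first_row_eigenvalues(C4, V4))
```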

From this small set of 2nd, 3rd, and 4th order
matrix-eigenvector forms we can construct larger order
matrix-eigenvector forms in a way consistent with the fast
transform technique. Consider for example a composite matrix
of the form

$$A = \begin{pmatrix} H_1 & H_2 & H_3 \\ H_2 & H_1 - H_2 + H_3 & H_2 \\ H_3 & H_2 & H_1 \end{pmatrix}$$

where $H_1$, $H_2$, and $H_3$ all have the same eigenvectors
(they commute) and are of the form

$$\begin{pmatrix} d & e \\ e & d \end{pmatrix}.$$

The eigenvectors of A are of the form $x \otimes y$, the direct
product of x, an eigenvector of a matrix of the form

$$\begin{pmatrix} a & b & c \\ b & a - b + c & b \\ c & b & a \end{pmatrix},$$

with y, an eigenvector of a matrix of the form

$$\begin{pmatrix} d & e \\ e & d \end{pmatrix}.$$

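See Afriat (1954) and Williamson (1931). Why this
must be so is easily illustrated in our example. Consider

$$A(x \otimes y) = \begin{pmatrix} H_1 & H_2 & H_3 \\ H_2 & H_1 - H_2 + H_3 & H_2 \\ H_3 & H_2 & H_1 \end{pmatrix} \begin{pmatrix} x_1 y \\ x_2 y \\ x_3 y \end{pmatrix} = \begin{pmatrix} x_1 H_1 y + x_2 H_2 y + x_3 H_3 y \\ x_1 H_2 y + x_2 (H_1 - H_2 + H_3) y + x_3 H_2 y \\ x_1 H_3 y + x_2 H_2 y + x_3 H_1 y \end{pmatrix}$$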

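$$\begin{pmatrix} a & b & c \\ b & a - b + c & b \\ c & b & a \end{pmatrix} \begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & -2 \\ 1 & -1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & -2 \\ 1 & -1 & 1 \end{pmatrix} \begin{pmatrix} a + b + c & 0 & 0 \\ 0 & a - c & 0 \\ 0 & 0 & a - 2b + c \end{pmatrix}$$

$$\begin{pmatrix} a & b & c & d \\ d & a & b & c \\ c & d & a & b \\ b & c & d & a \end{pmatrix} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & j & -1 & -j \\ 1 & -1 & 1 & -1 \\ 1 & -j & -1 & j \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & j & -1 & -j \\ 1 & -1 & 1 & -1 \\ 1 & -j & -1 & j \end{pmatrix} \mathrm{diag}\begin{pmatrix} a + b + c + d \\ a + jb - c - jd \\ a - b + c - d \\ a - jb - c + jd \end{pmatrix}$$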

Figure  10  illustrates  some  more  examples. 


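$$= \begin{pmatrix} x_1 n_1 y + x_2 n_2 y + x_3 n_3 y \\ x_1 n_2 y + x_2 (n_1 y - n_2 y + n_3 y) + x_3 n_2 y \\ x_1 n_3 y + x_2 n_2 y + x_3 n_1 y \end{pmatrix} = \begin{pmatrix} x_1 n_1 + x_2 n_2 + x_3 n_3 \\ x_1 n_2 + x_2 (n_1 - n_2 + n_3) + x_3 n_2 \\ x_1 n_3 + x_2 n_2 + x_3 n_1 \end{pmatrix} \otimes y,$$

where $H_i y = n_i y$ for $i = 1, 2, 3$.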
But since x is an eigenvector of any matrix of the form

$$\begin{pmatrix} a & b & c \\ b & a - b + c & b \\ c & b & a \end{pmatrix},$$

we have

$$\begin{pmatrix} n_1 & n_2 & n_3 \\ n_2 & n_1 - n_2 + n_3 & n_2 \\ n_3 & n_2 & n_1 \end{pmatrix} x = \lambda x.$$

Hence,

$$A(x \otimes y) = (\lambda x) \otimes y = \lambda (x \otimes y).$$

Thus if y is an eigenvector of $H_1$, $H_2$, and $H_3$ so that
$H_1 y = n_1 y$, $H_2 y = n_2 y$, and $H_3 y = n_3 y$, and x is an
eigenvector of

$$B = \begin{pmatrix} n_1 & n_2 & n_3 \\ n_2 & n_1 - n_2 + n_3 & n_2 \\ n_3 & n_2 & n_1 \end{pmatrix},$$

the matrix B having the same form as A except that the
corresponding eigenvalues appear in place of the H
matrices, and x has corresponding eigenvalue $\lambda$, then
$x \otimes y$ is an eigenvector of A, with eigenvalue $\lambda$.
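
The argument above can be confirmed numerically. The following
sketch builds one composite matrix of the stated form from three
2 x 2 blocks with arbitrary illustrative entries and checks that
every direct product of a 3 x 1 DLB vector with a 2 x 1 Hadamard
vector is an eigenvector. The helper names `sym2` and
`composite_eigenpairs` are hypothetical.

```python
import numpy as np

def sym2(p, q):
    """2 x 2 matrix of the form [[p, q], [q, p]]."""
    return np.array([[p, q], [q, p]], dtype=float)

H1, H2, H3 = sym2(6, 1), sym2(3, 2), sym2(1, 0.5)
A = np.block([[H1, H2, H3], [H2, H1 - H2 + H3, H2], [H3, H2, H1]])

X = np.array([[1, 1, 1], [1, 0, -1], [1, -2, 1]], dtype=float)   # candidate x vectors
Y = np.array([[1, 1], [1, -1]], dtype=float)                     # candidate y vectors

def composite_eigenpairs():
    """Return (eigenvalue, eigenvector) pairs predicted by the direct product argument."""
    pairs = []
    for x in X:
        for y in Y:
            n = [(H @ y)[0] / y[0] for H in (H1, H2, H3)]   # H_i y = n_i y
            B = np.array([[n[0], n[1], n[2]],
                          [n[1], n[0] - n[1] + n[2], n[1]],
                          [n[2], n[1], n[0]]])
            lam = (B @ x)[0] / x[0]                         # B x = lam x
            pairs.append((lam, np.kron(x, y)))
    return pairs

for lam, v in composite_eigenpairs():
    print(lam, np.allclose(A @ v, lam * v))
```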

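In this example the eigenvectors of A are

$$\begin{pmatrix} 1\\1\\1\\1\\1\\1 \end{pmatrix}, \begin{pmatrix} 1\\-1\\1\\-1\\1\\-1 \end{pmatrix}, \begin{pmatrix} 1\\1\\0\\0\\-1\\-1 \end{pmatrix}, \begin{pmatrix} 1\\-1\\0\\0\\-1\\1 \end{pmatrix}, \begin{pmatrix} 1\\1\\-2\\-2\\1\\1 \end{pmatrix}, \begin{pmatrix} 1\\-1\\-2\\2\\1\\-1 \end{pmatrix}$$

and they correspond to the fast transform implemented as
the direct product of the vectors in

$$\left\{ \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}, \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix} \right\}$$

with those in

$$\left\{ \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \end{pmatrix} \right\}.$$

The matrix A has the general form

$$A = \begin{pmatrix} a & b & c & d & e & f \\ b & a & d & c & f & e \\ c & d & a - c + e & b - d + f & c & d \\ d & c & b - d + f & a - c + e & d & c \\ e & f & c & d & a & b \\ f & e & d & c & b & a \end{pmatrix}$$

and has eigenvalues corresponding to the 6 listed
eigenvectors given respectively by

(a+b+c+d+e+f), (a-b+c-d+e-f), (a+b-e-f),
(a-b-e+f), (a+b-2c-2d+e+f), and (a-b-2c+2d+e-f).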


PRINCIPAL  COMPONENTS  TRANSFORM 

The fast transform question is how close a composite
matrix we can find to the general form of an autocovariance
matrix. If we can find a composite matrix similar in form
to the autocovariance matrix, then we would expect that
the easily computed eigenvectors of the composite form
would be similar to the eigenvectors of the autocovariance
matrix and that the transformation of the image by these
easily computed eigenvectors (which have a direct product
form) can be implemented as a fast transform.


Figure 11 illustrates the general form for the 16 x 16
autocovariance matrix of a 4 x 4 image and table 1 lists
which resolution cells in the 4x4 image the letters of
the covariance matrix relate to. The partitioning of this
matrix as 16 4x4 submatrices yields submatrices of two
general forms. The first form is that of the diagonal 4x4
submatrices. The second form is that of the off-diagonal
4x4 submatrices, which appear as their own transposes on
one side of the diagonal. In general, these forms do not
commute and the autocovariance matrix is not a composite
matrix. Hence the eigenvectors cannot be expressed as a
direct product of smaller dimensional vectors and there
does not exist a simple fast transform implementation for
them.

Under the assumption that the image is isotropic, the
autocovariance matrix simplifies somewhat as illustrated in
figure 12. This simplification might do some good since
when the autocovariance matrix is partitioned into 4x4
submatrices all the matrices have the same form:


Unfortunately, unlike the example matrix-eigenvectors
illustrated earlier, the eigenvectors for this matrix do
depend on the values of the elements in addition to the form
of the matrix. Thus although the autocovariance matrix
for the isotropic image can be partitioned into matrices
of the same form, these matrices do not, in general, have
the same eigenvectors.

However, there still is enough structure in the matrix
for us to capitalize on: the matrix is invariant under
certain permutations of its rows and columns. The theory of
permutation invariance is as follows. Suppose $Ax = \lambda x$ so
that x is an eigenvector of A and $\lambda$ is its corresponding
eigenvalue. Suppose P is a permutation matrix and that
A is invariant under P; that is, $A = P'AP$. Now consider
the eigenvectors of $P'AP$. Since x was an eigenvector of
A and $A = P'AP$ we must also have $P'APx = \lambda x$. But for a
permutation operator $P^{-1} = P'$ so that $A(Px) = \lambda (Px)$. Hence,
Px must also be an eigenvector of A having corresponding
eigenvalue $\lambda$.
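
A small numerical check of this invariance argument is given
below. It is a sketch only: the matrix is an arbitrary 4 x 4
symmetric Toeplitz matrix (not one taken from the figures), the
permutation is the one that reverses the component order, and the
helper names `toeplitz` and `check_invariance` are hypothetical.

```python
import numpy as np

def toeplitz(first_row):
    """Symmetric Toeplitz matrix with the given first row."""
    n = len(first_row)
    return np.array([[first_row[abs(i - j)] for j in range(n)] for i in range(n)], dtype=float)

A = toeplitz([4.0, 2.0, 1.0, 0.5])
P = np.fliplr(np.eye(4))   # permutation that reverses the component order

def check_invariance(A, P):
    """True if A = P'AP and P x is an eigenvector of A whenever x is."""
    if not np.allclose(A, P.T @ A @ P):
        return False
    lams, vecs = np.linalg.eigh(A)
    return all(np.allclose(A @ (P @ vecs[:, k]), lams[k] * (P @ vecs[:, k]))
               for k in range(len(lams)))

print(check_invariance(A, P))
```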

There are two permutations under which the
autocovariance matrix of the 4x4 isotropic image is invariant. They
are

P1 =
0000000000000001
0000000000000010
0000000000000100
0000000000001000
0000000000010000
0000000000100000
0000000001000000
0000000010000000
0000000100000000
0000001000000000
0000010000000000
0000100000000000
0001000000000000
0010000000000000
0100000000000000
1000000000000000

and

P2 =
0001000000000000
0010000000000000
0100000000000000
1000000000000000
0000000100000000
0000001000000000
0000010000000000
0000100000000000
0000000000010000
0000000000100000
0000000001000000
0000000010000000
0000000000000001
0000000000000010
0000000000000100
0000000000001000

Both P1 and P2 are symmetric and are their own inverses.
The autocovariance matrix is invariant under P1 because
it is symmetric about both diagonals. It is invariant under
P2 because each of the 4x4 submatrices of the
autocovariance matrix is symmetric about both diagonals.
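
Both invariances can be checked on a concrete isotropic
autocovariance matrix. The covariance model below (a constant
raised to the Euclidean distance between resolution cells, with
row-major ordering of the 4x4 image) is an assumption made only
for illustration, as are the value of `rho` and the function
name `isotropic_autocovariance`.

```python
import numpy as np

J4 = np.fliplr(np.eye(4))
P1 = np.fliplr(np.eye(16))    # reverses all 16 components
P2 = np.kron(np.eye(4), J4)   # reverses each group of four components

def isotropic_autocovariance(rho=0.9, n=4):
    """Assumed model: covariance rho**distance between cells of an n x n image."""
    coords = [(r, c) for r in range(n) for c in range(n)]
    return np.array([[rho ** np.hypot(r1 - r2, c1 - c2) for (r2, c2) in coords]
                     for (r1, c1) in coords])

A = isotropic_autocovariance()
print(np.allclose(A, P1.T @ A @ P1), np.allclose(A, P2.T @ A @ P2))
```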
P1 and P2 will tell us something about the form of the
eigenvectors. We will consider P1 first. Suppose x is
an eigenvector of the autocovariance matrix,

$$x = (x_1, x_2, \ldots, x_{16})'.$$

Then P1x is also an eigenvector of the autocovariance matrix.
Writing x and P1x with first component 1, we can set them
equal since they both have the same corresponding
eigenvalue. Hence

$$(1,\ x_2/x_1,\ \ldots,\ x_{16}/x_1) = (1,\ x_{15}/x_{16},\ \ldots,\ x_1/x_{16}).$$

Notice that $x_{16}/x_1 = x_1/x_{16}$. This implies that
$x_{16} = \pm x_1$. Taking $x_{16} = x_1$ we see that
$x_n = x_{17-n}$. Taking $x_{16} = -x_1$ we see that
$x_n = -x_{17-n}$.

Thus the eigenvectors have two forms: even and odd. And
the kinds of matrices they diagonalize must be bisymmetric
(Shanmugam & Haralick, 1973).

In our analysis of the invariance of the eigenvector to
P2 we need only consider the vector y of the first eight
components. Compare y and y with its first four and last
four components reversed. Notice that $y_4/y_1 = y_1/y_4$
so that $y_4 = \pm y_1$. When $y_4 = y_1$ we get the form

$$(1,\ y_2/y_1,\ y_2/y_1,\ 1,\ y_5/y_1,\ y_6/y_1,\ y_6/y_1,\ y_5/y_1).$$

When $y_4 = -y_1$ we get the form

$$(1,\ y_2/y_1,\ -y_2/y_1,\ -1,\ y_5/y_1,\ y_6/y_1,\ -y_6/y_1,\ -y_5/y_1).$$

This implies that there are four sets of eigenvectors, each
set having four vectors of a common form.

Figure 13 illustrates the fastest way to implement the
dot product of a vector with each of the four eigenvectors
in each set. Compared with the general 16x1 fast
transform which requires 128 add and multiply operations, the
implementation shown for the eigenvectors of the 16x16
autocovariance matrix requires 192 operations.

Our next question must be: how generalizable are these
results? In the general isotropic case for the N row by M
column subimage, the autocovariance matrix can be
partitioned into $N^2$ M x M submatrices each of the same
Toeplitz form. Although the Toeplitz form is the same, the
eigenvectors depend on the values in the Toeplitz matrix
form as well as on the form itself. Hence, we cannot say
that the $N^2$ submatrices have the same eigenvectors.

If we are willing to make one more idealization, that
the submatrices differ by a multiplicative constant, then
that would imply that they all have the same eigenvectors.
To say that the submatrices are proportional is to say that
whatever covariance relationships a row has to itself, the
covariance relationship of one row with another row is the
same modulo a multiplicative constant. It is an assumption
not as strong as the 1st order Markov dependence
assumption often made. Now the existence of a fast transform
follows from our previous discussion of composite matrices.
The number of operations involved in the fast
implementation would be N · NM for the first layer and M · NM for the
second layer. This is a total of (M + N)NM operations
compared to $(NM)^2$ for a brute force implementation.
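
Under the proportionality idealization the autocovariance matrix
is a direct (Kronecker) product, and the two-layer transform can
be compared against the brute-force one. The sketch below is
illustrative: the two small Toeplitz matrices hold arbitrary
values, N = 3 and M = 4 are chosen only to keep the example
small, and the names `fast_kl` and `brute_force_kl` are
hypothetical.

```python
import numpy as np

def toeplitz(first_row):
    """Symmetric Toeplitz matrix with the given first row."""
    n = len(first_row)
    return np.array([[first_row[abs(i - j)] for j in range(n)] for i in range(n)], dtype=float)

T = toeplitz([1.0, 0.8, 0.5, 0.3])   # M x M within-row covariance form (M = 4)
C = toeplitz([1.0, 0.6, 0.4])        # N x N constants of proportionality (N = 3)
A = np.kron(C, T)                    # every M x M submatrix of A is a multiple of T

_, Uc = np.linalg.eigh(C)
_, Ut = np.linalg.eigh(T)

def fast_kl(image):
    """Two-layer transform of an N x M image: about (M + N) N M multiplications."""
    return Uc.T @ image @ Ut

def brute_force_kl(image):
    """Same transform as one (NM) x (NM) matrix-vector product: (NM)**2 multiplications."""
    return (np.kron(Uc, Ut).T @ image.ravel()).reshape(image.shape)

img = np.arange(12, dtype=float).reshape(3, 4)
print(np.allclose(fast_kl(img), brute_force_kl(img)))
```

For N = 3 and M = 4 the counts are (4 + 3) · 12 = 84 against
144; for 16 x 16 subimages they are 8192 against 65536.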

CONCLUSIONS 

We have discussed the general fast transform from the
perspective of the vector direct product and have shown the
relationship between data compression transform coding and
the incomplete fast transform. We have illustrated that the
autocovariance matrix for an isotropic 4x4 image yields
eigenvectors which have a semi-fast transform. We have
discussed what assumption beyond isotropy is needed for the
fast KL transform to exist in the general case. In a future
paper we will discuss the suitability of these additional
assumptions.
Figure 13. Fast implementation for eigenvectors


Figure 12. Isotropic autocovariance form
Derivation of Optimum Variable Word Length Code


Given that a total of NPR projections are to be considered, the truncation
error T incurred by using only p of them, where the p projections (or
coefficients) are selected by first ordering the projections by variance and
p < NPR, is

$$T = \sum_{I=1}^{NPR} VAR(I) - \sum_{I=1}^{p} VAR(I) \qquad (1)$$

where VAR(I) = variance of the I-th component

$$VAR(I) = E\{(X_I - m_I)^2\}$$

with $X_I$ the I-th data component and $m_I$ the mean of the I-th component.

The total quantization error for the image is given by:

$$Q = \sum_{I=1}^{p} VAR(I)\, C^{-2\,Nbits(I)} \qquad (2)$$

for p Joel Max quantizers, where

C = 1.78 for normal distribution of any one component
Nbits(I) = bit assignment for the I-th component.

Combining equations (1) and (2) the total system error is:

$$E = \underbrace{\sum_{I=1}^{NPR} VAR(I) - \sum_{I=1}^{p} VAR(I)}_{\text{truncation error}} + \underbrace{\sum_{I=1}^{p} VAR(I)\, C^{-2\,Nbits(I)}}_{\text{min. variance quantization error}}$$

or

$$E = L - \sum_{I=1}^{p} VAR(I) \left(1 - C^{-2\,Nbits(I)}\right), \qquad L = \sum_{I=1}^{NPR} VAR(I) \qquad (3)$$
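
As a quick illustration of equations (1) to (3), the total error
can be evaluated directly for a given set of variances and bit
assignments. This is a sketch only; the function name
`total_error` is hypothetical, and the variances are assumed to
be supplied in decreasing order.

```python
def total_error(variances, nbits, C=1.78):
    """E = T + Q; `variances` sorted in decreasing order, len(nbits) = p retained."""
    p = len(nbits)
    truncation = sum(variances) - sum(variances[:p])                             # Eq. (1)
    quantization = sum(v * C ** (-2 * b) for v, b in zip(variances[:p], nbits))  # Eq. (2)
    return truncation + quantization

# Four projections, two retained with one bit each (C = 2 chosen for easy arithmetic).
print(total_error([4.0, 2.0, 1.0, 1.0], [1, 1], C=2.0))
```

With these numbers T = 8 - 6 = 2 and Q = 4/4 + 2/4 = 1.5, so
E = 3.5, which agrees with the form
L - sum VAR(I)(1 - C^(-2 Nbits(I))) = 8 - 4.5.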
The total error is therefore a function of p, the number of projections retained,
and Nbits(I), the number of bits assigned to each of the p quantizers. The
truncation error decreases with increasing p, while the quantization error
increases with increasing p. It is necessary to find the value of p and the
resulting set of bit assignments $\{Nbits(I)\}_{I=1}^{p}$ which minimizes the
total error E.

Given that some total number of bits, say Mbits, is to be assigned to p Joel Max
quantizers, i.e.

$$\sum_{I=1}^{p} Nbits(I) = Mbits,$$

determine the optimum number of projections p (ordered according to decreasing
variances) and the corresponding bit assignments $\{Nbits(I)\}_{I=1}^{p}$ that
minimize the total system error

$$E = T + Q$$

Restated in a better form:

$$\min_{(p,\ Nbits(I))} \left\{ L - \sum_{I=1}^{p} VAR(I) \left(1 - C^{-2\,Nbits(I)}\right) \right\}$$

where

$$L = \sum_{I=1}^{NPR} VAR(I)$$

and by definition is a positive constant, under the constraint

$$\sum_{I=1}^{p} Nbits(I) = Mbits.$$

Since L is a constant:

$$L \ge \sum_{I=1}^{p} VAR(I) \left(1 - C^{-2\,Nbits(I)}\right)$$
with the equality holding for p = NPR and Mbits $\to \infty$. An equivalent
problem is to maximize the expression on the right, or:

$$\max_{(p,\ Nbits(I))} \left\{ \sum_{I=1}^{p} VAR(I) \left(1 - C^{-2\,Nbits(I)}\right) \right\}$$

subject to

$$\sum_{I=1}^{p} Nbits(I) = Mbits.$$

Considering Nbits and p as continuous variables and using a Lagrange
multiplier $\lambda$ we have

$$\frac{\partial}{\partial Nbits(J)} \left[ \sum_{I=1}^{p} VAR(I) \left(1 - C^{-2\,Nbits(I)}\right) + \lambda \sum_{I=1}^{p} Nbits(I) \right] = 0, \quad J = 1, \ldots, p \qquad (4)$$

$$\frac{\partial}{\partial p} \left[ \sum_{I=1}^{p} VAR(I) \left(1 - C^{-2\,Nbits(I)}\right) + \lambda \sum_{I=1}^{p} Nbits(I) \right] = 0 \qquad (5)$$

$$\sum_{I=1}^{p} Nbits(I) = Mbits \qquad (6)$$

Equation (4) represents p equations while (5) and (6) represent one equation each.
There are then p + 2 equations and p + 2 unknowns

$$\left(p,\ \lambda,\ \{Nbits(I)\}_{I=1}^{p}\right).$$

Taking the partial derivatives of (4) and noting

$$e^{\ln C^{-2\,Nbits(I)}} = C^{-2\,Nbits(I)}$$

we have

$$2\, VAR(J)\, \ln C \; C^{-2\,Nbits(J)} + \lambda = 0 \qquad (7)$$

Likewise, if (5) is approximated by

$$\frac{d}{dp} \left[ \int_{1}^{p} VAR(I) \left(1 - C^{-2\,Nbits(I)}\right) dI + \lambda \int_{1}^{p} Nbits(I)\, dI \right] = 0$$

then by the fundamental theorem of calculus:

$$VAR(p) \left(1 - C^{-2\,Nbits(p)}\right) + \lambda\, Nbits(p) = 0 \qquad (8)$$

Solving equation (7) for Nbits:

$$2\, VAR(J)\, \ln C \; C^{-2\,Nbits(J)} + \lambda = 0$$

$$C^{-2\,Nbits(J)} = \frac{-\lambda}{2\, VAR(J)\, \ln C}$$

Taking $\log_C$ we have:

$$-2\, Nbits(J) = \log_C \left[ \frac{-\lambda}{2\, VAR(J)\, \ln C} \right]$$

$$Nbits(J) = -\frac{1}{2} \log_C \left[ \frac{-\lambda}{2\, \ln C\; VAR(J)} \right] = \frac{1}{2} \log_C \left[ \frac{-2\, \ln C\; VAR(J)}{\lambda} \right] \qquad (9)$$

For $A = \ln C$,

$$Nbits(J) = \frac{1}{2} \log_C \left[ \frac{-2A\, VAR(J)}{\lambda} \right]$$

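Substituting this result into (6):

$$Mbits = \sum_{J=1}^{p} Nbits(J) = \sum_{J=1}^{p} \frac{1}{2} \log_C \left[ \frac{-2A\, VAR(J)}{\lambda} \right] = \frac{1}{2} \log_C \left[ \left( \frac{-2A}{\lambda} \right)^{p} \prod_{J=1}^{p} VAR(J) \right] \qquad (10)$$

Now we must solve (10) for $\lambda$:

$$\left( \frac{-2A}{\lambda} \right)^{p} \prod_{I=1}^{p} VAR(I) = C^{2\,Mbits}$$

$$\left( \frac{-2A}{\lambda} \right)^{p} = \frac{C^{2\,Mbits}}{\prod_{I=1}^{p} VAR(I)}$$

$$\frac{-2A}{\lambda} = \left[ \frac{C^{2\,Mbits}}{\prod_{I=1}^{p} VAR(I)} \right]^{1/p}$$

$$\lambda = -2A \left[ C^{-2\,Mbits} \prod_{I=1}^{p} VAR(I) \right]^{1/p}$$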
L 1-1 
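Substituting into (9):

$$Nbits(J) = \frac{1}{2} \log_C \left[ \frac{-2A\, VAR(J)}{-2A \left[ C^{-2\,Mbits} \prod_{I=1}^{p} VAR(I) \right]^{1/p}} \right] = \frac{1}{2} \log_C \left[ \frac{VAR(J)}{\left[ C^{-2\,Mbits} \prod_{I=1}^{p} VAR(I) \right]^{1/p}} \right]$$

This result shows that:

$$Nbits(J) = \frac{1}{2} \log_C VAR(J) + \frac{Mbits}{p} - \frac{1}{2p} \log_C \prod_{I=1}^{p} VAR(I)$$

$$Nbits(J) = \frac{1}{2} \log_C VAR(J) + \frac{Mbits}{p} - \frac{1}{2p} \sum_{I=1}^{p} \log_C VAR(I)$$

$$Nbits(J) = \frac{1}{2} \left[ \frac{\log_C VAR(J)\, \ln C}{\ln C} \right] + \frac{Mbits}{p} - \frac{1}{2p} \sum_{I=1}^{p} \left[ \frac{\log_C VAR(I)\, \ln C}{\ln C} \right]$$

but $\log_C VAR(I)\, \ln C = \ln VAR(I)$, so

$$Nbits(J) = \frac{1}{2A} \ln VAR(J) + \frac{Mbits}{p} - \frac{1}{2Ap} \sum_{I=1}^{p} \ln VAR(I) \qquad (11)$$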
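
The bit assignment rule of Eq. (11) is short enough to state as
code. This is a minimal sketch of the continuous-valued
assignment, assuming the p retained variances are passed in
directly; the function name `bit_allocation` is hypothetical.

```python
import math

def bit_allocation(variances, mbits, C=1.78):
    """Continuous Nbits(J) for the p retained variances, following Eq. (11)."""
    p = len(variances)
    A = math.log(C)
    mean_log = sum(math.log(v) for v in variances) / p   # (1/p) * sum of ln VAR(I)
    return [math.log(v) / (2 * A) + mbits / p - mean_log / (2 * A) for v in variances]

bits = bit_allocation([8.0, 4.0, 2.0, 1.0], mbits=12)
print(bits, sum(bits))
```

The assignments always sum to Mbits, components with larger
variance receive more bits, and equal variances receive Mbits/p
bits each. Nothing in Eq. (11) prevents a small variance from
receiving a negative value, which is one reason the choice of p
matters.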

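Evaluating (11) at J = p and substituting into (8); repeating Eq. (8):

$$VAR(p) - VAR(p)\, C^{-2\,Nbits(p)} + \lambda\, Nbits(p) = 0$$

Substituting for the last term on the left:

$$\lambda\, Nbits(p) = -2A \left[ \frac{\prod_{I=1}^{p} VAR(I)}{C^{2\,Mbits}} \right]^{1/p} \left[ \frac{1}{2A} \ln VAR(p) + \frac{Mbits}{p} - \frac{1}{2Ap} \sum_{I=1}^{p} \ln VAR(I) \right]$$

$$= - \left[ \frac{\prod_{I=1}^{p} VAR(I)}{C^{2\,Mbits}} \right]^{1/p} \left[ \ln VAR(p) + \frac{2A\, Mbits}{p} - \frac{1}{p} \sum_{I=1}^{p} \ln VAR(I) \right]$$

Substituting for the second term on the left of Eq. (8):

$$C^{-2\,Nbits(p)} = \frac{1}{VAR(p)} \left[ C^{-2\,Mbits} \prod_{I=1}^{p} VAR(I) \right]^{1/p}$$

resulting in

$$VAR(p)\, C^{-2\,Nbits(p)} = \left[ \frac{\prod_{I=1}^{p} VAR(I)}{C^{2\,Mbits}} \right]^{1/p}$$

Eq. (8) then becomes:

$$\underbrace{VAR(p)}_{\text{term 1}} - \underbrace{\left[ \frac{\prod_{I=1}^{p} VAR(I)}{C^{2\,Mbits}} \right]^{1/p} \left[ 1 + \ln VAR(p) + \frac{2A\, Mbits}{p} - \frac{1}{p} \sum_{I=1}^{p} \ln VAR(I) \right]}_{\text{term 2}} = 0 \qquad (12)$$

The value of p satisfying Eq. (12) and the bit assignment by (11) represent the
approximate solution to the minimization described earlier.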
Now given the N x N transformation T, the transform vector y is created by the
transform encoder. The number of bits per vector element (the rate R) is
specified, thus:

$$Mbits = N \cdot R$$

Eq. (12) is then used to determine the value p which most closely satisfies the
equality:

$$NOPT = \arg\min_{1 \le p \le NPR} \left| VAR(p) - \left[ C^{-2\,Mbits} \prod_{I=1}^{p} VAR(I) \right]^{1/p} \left[ 1 + \ln VAR(p) + \frac{2A\, Mbits}{p} - \frac{1}{p} \sum_{I=1}^{p} \ln VAR(I) \right] \right| \qquad (13)$$

The values of the Nbits(I) generated are not necessarily integers. The integers must
therefore be chosen by some rule. This was accomplished by rounding off each Nbits(I)
to the nearest integer. If the resulting rate R exceeds or is less than that specified
then one bit is added or subtracted, one at a time for minimum error. It is this
rounding off procedure which will be replaced by the Huffman Coder.
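
The selection of p by Eq. (13) and the rounding step can be
sketched as follows. This is an illustration under stated
assumptions, not the program used in this study: the function
names are hypothetical, negative assignments are clipped to zero
before adjustment, and the bit that is added or removed is chosen
by distance from the unrounded value rather than by an explicit
minimum-error computation.

```python
import math

def eq12_residual(variances, p, mbits, C=1.78):
    """Left side of Eq. (12) for the p largest variances (list in decreasing order)."""
    A = math.log(C)
    mean_log = sum(math.log(v) for v in variances[:p]) / p
    geo = math.exp(mean_log) * C ** (-2.0 * mbits / p)   # [prod VAR / C^(2 Mbits)]^(1/p)
    bracket = 1.0 + math.log(variances[p - 1]) + 2.0 * A * mbits / p - mean_log
    return variances[p - 1] - geo * bracket

def choose_p(variances, mbits, C=1.78):
    """Value of p that most closely satisfies Eq. (12), as in Eq. (13)."""
    return min(range(1, len(variances) + 1),
               key=lambda p: abs(eq12_residual(variances, p, mbits, C)))

def round_bits(real_bits, mbits):
    """Round to non-negative integers, then add or remove single bits to total mbits."""
    bits = [max(0, round(b)) for b in real_bits]
    while sum(bits) != mbits:
        step = 1 if sum(bits) < mbits else -1
        candidates = [j for j in range(len(bits)) if bits[j] + step >= 0]
        # Assumed rule: adjust the component furthest from its unrounded value.
        j = max(candidates, key=lambda j: step * (real_bits[j] - bits[j]))
        bits[j] += step
    return bits

print(choose_p([16.0, 8.0, 4.0, 2.0, 1.0, 0.5], mbits=12))
print(round_bits([3.45, 2.3, 1.2, 1.05], mbits=8))
```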
APPENDIX III


Theorem 2

Let $x_1, x_2, \ldots, x_K$ be given vectors in $R^N$. Let
$P : R^N \to V_M$ be an orthogonal projection operator onto the
subspace $V_M$ spanned by the M eigenvectors of
$\sum_{k=1}^{K} x_k x_k'$ with largest eigenvalues. Let

$$\epsilon_1^2 = \sum_{k=1}^{K} (x_k - P x_k)' (x_k - P x_k).$$

Let $z = \frac{1}{K} \sum_{k=1}^{K} x_k$. Let $P^* : R^N \to V_M^*$ be an
orthogonal projection operator onto the subspace $V_M^*$ spanned
by the M eigenvectors of $\sum_{k=1}^{K} (x_k - z)(x_k - z)'$ with
largest eigenvalues. Let

$$\epsilon_2^2 = \sum_{k=1}^{K} \left( (x_k - z) - P^*(x_k - z) \right)' \left( (x_k - z) - P^*(x_k - z) \right).$$

Then $\epsilon_2^2 \le \epsilon_1^2$.

Proof: Simplifying $\epsilon_1^2$ and $\epsilon_2^2$ we obtain

$$\epsilon_1^2 = \mathrm{Trace} \left( (I - P) \sum_{k=1}^{K} x_k x_k' \right)$$

$$\epsilon_2^2 = \mathrm{Trace} \left( (I - P^*) \sum_{k=1}^{K} (x_k - z)(x_k - z)' \right)$$

But,

$$\sum_{k=1}^{K} (x_k - z)(x_k - z)' = \sum_{k=1}^{K} x_k x_k' - \sum_{k=1}^{K} x_k z' - z \sum_{k=1}^{K} x_k' + K z z' = \sum_{k=1}^{K} x_k x_k' - K z z'.$$

Hence,

$$\epsilon_1^2 = \mathrm{Trace} \left( (I - P) \left( \sum_{k=1}^{K} (x_k - z)(x_k - z)' + K z z' \right) \right)$$

$$\epsilon_2^2 = \mathrm{Trace} \left( (I - P^*) \sum_{k=1}^{K} (x_k - z)(x_k - z)' \right)$$

From principal components we must have

$$\mathrm{Trace} \left( (I - P^*) \sum_{k=1}^{K} (x_k - z)(x_k - z)' \right) \le \mathrm{Trace} \left( (I - P) \sum_{k=1}^{K} (x_k - z)(x_k - z)' \right)$$

Therefore, $\epsilon_2^2 \le \epsilon_1^2 - \mathrm{Trace}((I - P) K z z')$, or

$$\epsilon_2^2 \le \epsilon_1^2 - K z'(I - P) z.$$

Since $(I - P)$ is positive semi-definite, $z'(I - P) z \ge 0$.
Consequently, $\epsilon_2^2 \le \epsilon_1^2 - K z'(I - P) z \le \epsilon_1^2$.

Theorem  2 has  shown  that  if  we  translate  the  data  by 
its  mean,  the  squared  error  for  an  M-dimensional  projection 
is  less  than  without  any  translation.  We  now  show  that  no 
other  translation  is  better  than  the  mean. 
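
Theorem 2 can be illustrated numerically. The following sketch
uses synthetic data with a deliberately non-zero mean; the data,
the dimensions, and the function name `projection_error` are
assumptions made only for this example.

```python
import numpy as np

def projection_error(X, M):
    """Squared error after projecting the rows of X onto the top-M eigenvectors
    of sum_k x_k x_k'."""
    S = X.T @ X
    _, vecs = np.linalg.eigh(S)     # eigenvalues returned in ascending order
    V = vecs[:, -M:]                # M eigenvectors with largest eigenvalues
    residual = X - X @ V @ V.T
    return float(np.sum(residual ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6)) + 5.0   # K = 200 vectors in R^6 with a non-zero mean
z = X.mean(axis=0)

eps1 = projection_error(X, M=2)       # no translation
eps2 = projection_error(X - z, M=2)   # data translated by its mean
print(eps1, eps2)
```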


Theorem 3

Let $x_1, x_2, \ldots, x_K$ be given vectors in $R^N$. Let
$P_a : R^N \to R^N$ be the orthogonal projection operator of rank M
which minimizes

$$\epsilon^2(a) = \sum_{k=1}^{K} (x_k - a)' (I - P_a) (x_k - a).$$

Then, $\epsilon^2(z) \le \epsilon^2(a)$ where

$$z = \frac{1}{K} \sum_{k=1}^{K} x_k.$$

Proof: consider $\epsilon^2(a)$,

$$\epsilon^2(a) = \sum_{k=1}^{K} (x_k - a + z - z)' (I - P_a) (x_k - a + z - z)$$

$$= \sum_{k=1}^{K} \left[ (x_k - z)'(I - P_a)(x_k - z) + 2 (x_k - z)'(I - P_a)(z - a) + (z - a)'(I - P_a)(z - a) \right]$$

But $\sum_{k=1}^{K} (x_k - z) = 0$; hence,

$$\epsilon^2(a) = \sum_{k=1}^{K} (x_k - z)'(I - P_a)(x_k - z) + K (z - a)'(I - P_a)(z - a)$$

Now $\epsilon^2(z) = \sum_{k=1}^{K} (x_k - z)'(I - P_z)(x_k - z)$. By principal
components,

$$\sum_{k=1}^{K} (x_k - z)'(I - P_z)(x_k - z) \le \sum_{k=1}^{K} (x_k - z)'(I - P_a)(x_k - z).$$

Therefore,

$$\epsilon^2(z) \le \epsilon^2(a) - K (z - a)'(I - P_a)(z - a).$$

Since $(I - P_a)$ is positive semi-definite,
$(z - a)'(I - P_a)(z - a) \ge 0$. Consequently,

$$\epsilon^2(z) \le \epsilon^2(a).$$
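
Theorem 3 can be checked in the same spirit by comparing the
mean against other translations. The sketch below relies on the
standard fact that the minimum of the rank-M projection error
equals the sum of the N - M smallest eigenvalues of the scatter
matrix; the synthetic data and the function name
`min_projection_error` are assumptions for illustration.

```python
import numpy as np

def min_projection_error(X, a, M):
    """epsilon^2(a): error of the best rank-M orthogonal projection of the vectors x_k - a."""
    D = X - a
    eigvals = np.linalg.eigvalsh(D.T @ D)   # ascending order
    return float(np.sum(eigvals[:-M]))      # sum of the discarded (smallest) eigenvalues

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5)) + np.array([1.0, 2.0, 3.0, 4.0, 5.0])
z = X.mean(axis=0)

best = min_projection_error(X, z, M=2)
others = [min_projection_error(X, z + rng.normal(size=5), M=2) for _ in range(20)]
print(best, min(others))
```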